NEW KOLMOGOROV BOUNDS FOR FUNCTIONALS
OF BINOMIAL POINT PROCESSES

by Raphaël Lachièze-Rey and Giovanni Peccati

Abstract: We obtain explicit Berry-Esseen bounds in the Kolmogorov distance for the normal 
approximation of non-linear functionals of vectors of independent random variables. Our results 
are based on the use of Stein’s method and of random difference operators, and generalise the 
bounds recently obtained by Chatterjee (2008), concerning normal approximations in the Wasserstein
distance. In order to obtain lower bounds for variances, we also revisit the classical Hoeffding
decompositions, for which we provide a new proof and a new representation. Several applications 
are discussed in detail: in particular, new Berry-Esseen bounds are obtained for set approximations 
with random tessellations, as well as for functionals of covering processes.

1 Introduction

1.1 Overview 

Let $X = (X_1, \dots, X_n)$ be a collection of independent random variables, defined on some probability
space $(\Omega, \mathcal{F}, \mathbb{P})$ and taking values in some Polish space $(E, \mathcal{E})$, and let $f : E^n \to \mathbb{R}$ be a measurable
function such that $f(X)$ is square-integrable. The aim of the present paper is to deduce a new
class of explicit upper bounds for the Kolmogorov distance $d_K(f(X), N)$ between the distribution
of $f(X)$ and that of a Gaussian random variable $N \sim \mathcal{N}(m, \sigma^2)$ such that $m = \mathbb{E} f(X)$ and
$\sigma^2 = \operatorname{Var} f(X)$. Recall that $d_K(f(X), N)$ is defined as:

$$d_K(f(X), N) = \sup_{t \in \mathbb{R}} \big| \mathbb{P}[f(X) \le t] - \mathbb{P}[N \le t] \big|.$$

The problem of obtaining explicit estimates on the distance between the distributions of $f(X)$ and
$N$ has been recently dealt with in the paper [4], where the author was able to apply a standard
version of Stein's method (see e.g. dl) in order to deduce effective upper bounds on the Wasserstein
distance

$$d_W(f(X), N) = \sup_h \big| \mathbb{E}[h(f(X))] - \mathbb{E}[h(N)] \big|,$$

where the supremum runs over 1-Lipschitz functions, by using a class of difference operators that
we shall explicitly describe in Section 2.1 below (see e.g. [SKIllEI] for some relevant applications
of these bounds).

It is a well-known fact that upper bounds on $d_W(f(X), N)$ also yield a (typically suboptimal)
bound on $d_K(f(X), N)$ via the standard relation $d_K(f(X), N) \le 2\sqrt{d_W(f(X), N)}$. The challenge
we are setting ourselves in the present paper is to deduce upper bounds on $d_K(f(X), N)$ that are
potentially of the same order as the bounds on $d_W(f(X), N)$ that can be deduced from [3]. Our
main abstract findings appear in the statement of Theorem 4.2 below. In order to prove our main
bounds, we shall exploit some novel estimates on the solution of the Stein equations associated
with the Kolmogorov distance, that are strongly inspired by computations developed in 13126] in
the framework of normal approximations for functionals of Poisson random measures.

Another important contribution of the present work (see Section 2.2) is a novel representation (in
terms of difference operators) of the kernels determining the Hoeffding decomposition (see e.g.
[13, 22, 29], as well as [28, Chapter 5]) of a random variable of the type $f(X)$. This new representation
is used to deduce effective lower bounds on $\operatorname{Var} f(X)$.

As demonstrated in the sections to follow, we are mainly interested in geometric applications
and, in particular, in the normal approximation of geometric functionals whose dependency structure
can be assessed by using second order difference operators. One of the applications developed
in detail in Section 6.1 is that of Voronoi set approximations, where a given set $K$ is estimated by the
union of Voronoi cells. Remarkably, our bounds allow one to deduce normal approximation bounds
for the volume approximation of sets $K$ having a highly non-regular boundary. The present paper
is associated with the work [18], where it is proved that, for a large class of sets with self-similar
boundary of dimension $s > d - 1$, the variance of the volume approximation is asymptotically of
the same order as $n^{-2+s/d}$, and the Kolmogorov distance between the volume approximation and
the normal law is smaller than some multiple of a negative power of $n$ multiplied by a logarithmic
term. It turns out that the crucial feature for a set to be well behaved with respect to Voronoi
approximation is its density at the boundary, which is mathematically independent of its fractal
dimension (see [18] for an in-depth discussion of these phenomena). For illustrative purposes, we
will also present an application of our methods to covering processes (re-obtaining the results of [n]
in a slightly more general framework, see Section 6.2 below), as well as to some models already
studied in [3] and m-

In the recent reference [10], Gloria and Nolen have effectively used Theorem 4.2 below for
deducing Berry-Esseen bounds in the Kolmogorov distance for the effective conductance on the
discrete torus.

1.2 Plan 

Section 2 contains our main results concerning decompositions of random variables. Section 3 deals 
with some estimates associated with Stein’s method, and Section 4 contains our main abstract 
findings. Section 5 focusses on estimates based on second order difference operators. Finally, 
several applications are developed in Section 6. 

From now on, every random object is defined on an adequate common probability space
$(\Omega, \mathcal{F}, \mathbb{P})$, with $\mathbb{E}$ denoting expectation with respect to $\mathbb{P}$.

2 Decomposing random variables

2.1 Some difference operators 

Let $(E, \mathcal{E})$ be a Polish space endowed with its Borel $\sigma$-field. Given two vectors
$y = (y_1, \dots, y_n) \in E^n$ and $y' = (y'_1, \dots, y'_n) \in E^n$, for every $C \subseteq [n] := \{1, \dots, n\}$ and every
measurable function $f : E^n \to \mathbb{R}$, we denote by $f^C(y, y')$ the quantity that is obtained from $f(y)$
by replacing $y_i$ with $y'_i$ whenever $i \in C$. For instance, if $n = 4$ and $C = \{1, 4\}$, then

$$f^C(y, y') = f(y'_1, y_2, y_3, y'_4)$$

and

$$f^C(y', y) = f(y_1, y'_2, y'_3, y_4).$$

Given $C \subseteq [n]$, we introduce the operator

$$\Delta_C f(y, y') = f(y) - f^C(y, y').$$

When $C = \{j\}$ (to simplify the notation), we shall often write $f^{\{j\}} = f^j$ and $\Delta_{\{j\}} = \Delta_j$, for
$j = 1, \dots, n$, in such a way that

$$\Delta_{\{j\}} f(y, y') = \Delta_j f(y, y') = f(y) - f^j(y, y') = f(y) - f(y_1, \dots, y_{j-1}, y'_j, y_{j+1}, \dots, y_n),$$

and

$$\Delta_{\{j\}} f(y', y) = \Delta_j f(y', y) = f(y') - f^j(y', y) = f(y') - f(y'_1, \dots, y'_{j-1}, y_j, y'_{j+1}, \dots, y'_n).$$

We can canonically iterate the operator $\Delta_j$ as follows: for every $k \ge 2$ and every choice of distinct
indices $1 \le i_1 < \cdots < i_k \le n$, the quantity $\Delta_{i_1} \cdots \Delta_{i_k} f(y, y')$ is defined as

$$\Delta_{i_1} \cdots \Delta_{i_k} f(y, y') = \Delta_{i_1} \cdots \Delta_{i_{k-1}} f(y, y') - \big(\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(y, y')\big)^{i_k},$$

where $\big(\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(y, y')\big)^{i_k}$ is obtained by replacing $y_{i_k}$ with $y'_{i_k}$ inside the argument of
$\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(y, y')$.

Note that the operator $\Delta_{i_1} \cdots \Delta_{i_k}$ defined in this way is invariant with respect to permutations of
the indices $i_1, \dots, i_k$. For instance, if $n = 2$,

$$\Delta_1 \Delta_2 f(y, y') = \Delta_2 \Delta_1 f(y, y') = f(y'_1, y'_2) - f(y'_1, y_2) - f(y_1, y'_2) + f(y_1, y_2).$$
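The definitions above can be sketched in a few lines of Python. This is only an illustration under our own conventions: the helper names `replace`, `delta` and `iterated_delta` are hypothetical, coordinates are indexed from 1 as in the text, and the operators act on functions of the pair $(y, y')$.

```python
def replace(y, y_prime, C):
    """Return y with the coordinates whose (1-based) index lies in C replaced by those of y_prime."""
    return tuple(yp if (i + 1) in C else yi
                 for i, (yi, yp) in enumerate(zip(y, y_prime)))


def delta(g, j):
    """Delta_j applied to a function g(y, y'): g(y, y') minus g evaluated after swapping in y'_j."""
    def dg(y, y_prime):
        return g(y, y_prime) - g(replace(y, y_prime, {j}), y_prime)
    return dg


def iterated_delta(f, indices):
    """Iterated operator Delta_{i_1} ... Delta_{i_k} f, built one index at a time."""
    g = lambda y, y_prime: f(y)
    for j in indices:
        g = delta(g, j)
    return g


# Example: f(y) = y_1 * y_2 + y_3^2, for which Delta_1 Delta_2 f = (y_1 - y'_1)(y_2 - y'_2).
f = lambda y: y[0] * y[1] + y[2] ** 2
y, y_prime = (1, 2, 3), (4, 5, 6)
d12 = iterated_delta(f, (1, 2))(y, y_prime)
d21 = iterated_delta(f, (2, 1))(y, y_prime)
```

For this choice of $f$ both orders of iteration give the same value, in line with the permutation invariance noted above.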

The notation introduced above also extends to random variables: if $X = (X_1, \dots, X_n)$ and
$X' = (X'_1, \dots, X'_n)$ are two random vectors with values in $E^n$, then we write

$$\Delta_C f(X, X') := f(X) - f^C(X, X'), \quad C \subseteq [n],$$

and define $\Delta_{i_1} \cdots \Delta_{i_k} f(X, X')$, $1 \le i_1 < \cdots < i_k \le n$, exactly as above. The definitions of
$\Delta_C f(X', X)$ and $\Delta_{i_1} \cdots \Delta_{i_k} f(X', X)$ are given analogously. Now assume that $\mathbb{E}[|f(X)|] < \infty$. Our
aim in this section is to discuss two representations of the quantity $f(X) - \mathbb{E}[f(X)]$ that are based
on the use of the difference operators $\Delta_j$. The first one is a reformulation of the classical Hoeffding
decomposition for functions of independent random variables (see e.g. [13, 22, 29], as well as
[28, Chapter 5]). The second one comes from [4] (see also [5, Chapter 7]) and will play an important
role in the derivation of our main estimates.

2.2 A new look at Hoeffding decompositions

Throughout this section, for every fixed integer $n \ge 1$ we write $X = (X_1, \dots, X_n)$ to indicate a vector
of independent random variables with values in the Polish space $E$, and let $X' = (X'_1, \dots, X'_n)$ be
an independent copy of $X$. If $f : E^n \to \mathbb{R}$ is a measurable function such that $\mathbb{E}[f(X)^2] < \infty$, then
the classical theory of Hoeffding decompositions for functions of independent random variables (see
e.g. [13, 22, 29]) implies that $f(X)$ admits a unique decomposition of the type

$$f(X) = \mathbb{E}[f(X)] + \sum_{k=1}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} \varphi_{i_1, \dots, i_k}(X_{i_1}, \dots, X_{i_k}), \tag{2.1}$$

where the square-integrable kernels $\varphi_{i_1, \dots, i_k}$ verify the degeneracy condition

$$\mathbb{E}\big[\varphi_{i_1, \dots, i_k}(X_{i_1}, \dots, X_{i_k}) \,\big|\, X_{j_1}, \dots, X_{j_a}\big] = 0$$

for any strict subset $\{j_1, \dots, j_a\}$ of $\{i_1, \dots, i_k\}$. The derivation of (2.1) is customarily based on some
implicit recursive application of the inclusion-exclusion principle, and the kernels $\varphi_{i_1, \dots, i_k}$ can be
represented as linear combinations of conditional expectations. As abundantly illustrated in the
above-mentioned references, a representation such as (2.1) is extremely useful for analysing the
variance of a wide range of random variables (in particular, $U$-statistics). Our aim in the present
section is to point out a very compact way of writing the decomposition (2.1), that is based on
the use of the operators $\Delta_j$ introduced above. Albeit not surprising, such an approach towards
Hoeffding decompositions seems to be new and of independent interest, and will be quite useful
in the present paper for explicitly deriving lower bounds on variances. Our starting point is the
following statement, where we make use of the notation introduced in Section 2.1.

Lemma 2.1. For every $f : E^n \to \mathbb{R}$,

$$f(y) - f(y') = \sum_{k=1}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} (-1)^k \Delta_{i_1} \cdots \Delta_{i_k} f(y', y). \tag{2.2}$$

Proof. The key observation is that, for every $k \ge 1$ and every $B = \{i_1, \dots, i_k\}$,

$$\Delta_{i_1} \cdots \Delta_{i_k} f(y', y) = \sum_{A \subseteq B} (-1)^{|A|} f^A(y', y),$$

a relation that can be easily proved by recursion. By virtue of this fact, one can now rewrite the
right-hand side of (2.2) as

$$\sum_{A \subseteq [n]} \psi(A) \times Z(A), \tag{2.3}$$

where $\psi(A) := f^A(y', y)$ and $Z(A) := \sum_{B \,:\, B \neq \emptyset,\, A \subseteq B} (-1)^{|A| + |B|}$. Standard combinatorial
considerations yield that $Z([n]) = 1$, $Z(\emptyset) = -1$ and $Z(A) = 0$ for every non-empty strict subset $A$ of $[n]$. This
implies that (2.3) is indeed equal to $\psi([n]) - \psi(\emptyset) = f(y) - f(y')$, and the desired conclusion follows at once. □
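Identity (2.2) can be checked numerically on a small example. The sketch below is only an illustration: the helper names are ours, coordinates are indexed from 0, and the iterated differences are evaluated through the inclusion-exclusion relation used in the proof rather than by recursion.

```python
from itertools import combinations


def f_swapped(f, y_prime, y, A):
    """f^A(y', y): evaluate f at y' with the coordinates in A replaced by those of y."""
    return f(tuple(y[i] if i in A else y_prime[i] for i in range(len(y))))


def iterated_delta(f, y_prime, y, B):
    """Delta_{i_1} ... Delta_{i_k} f(y', y) for B = {i_1, ..., i_k}, via inclusion-exclusion."""
    return sum((-1) ** r * f_swapped(f, y_prime, y, set(A))
               for r in range(len(B) + 1)
               for A in combinations(B, r))


def lemma_rhs(f, y, y_prime):
    """Right-hand side of (2.2): signed sum of all iterated differences of f at (y', y)."""
    n = len(y)
    return sum((-1) ** k * iterated_delta(f, y_prime, y, B)
               for k in range(1, n + 1)
               for B in combinations(range(n), k))


# Integer-valued test function, so the comparison below is exact.
f = lambda y: y[0] * y[1] - 3 * y[2] + y[0] * y[1] * y[2] * y[3]
y, y_prime = (1, 2, 3, 4), (5, 6, 7, 8)
lhs = f(y) - f(y_prime)
rhs = lemma_rhs(f, y, y_prime)
```

Here `lhs` and `rhs` agree, as Lemma 2.1 predicts for any choice of `f`, `y` and `y_prime`.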

Now fix an integer $n$, as well as $n$-dimensional vectors $X$ and $X'$ as above (in particular, $X'$
is an independent copy of $X$); the following statement provides an alternate description of the
Hoeffding decomposition of $f(X)$ in terms of the difference operators defined above.

Theorem 2.2 (Hoeffding decompositions). Let $f : E^n \to \mathbb{R}$ be such that $\mathbb{E}[f(X)^2] < \infty$. One has
the following representation for $f(X)$:

$$f(X) = \mathbb{E}[f(X)] + \sum_{k=1}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} (-1)^k \mathbb{E}\big[\Delta_{i_1} \cdots \Delta_{i_k} f(X', X) \,\big|\, X\big]. \tag{2.4}$$

Formula (2.4) coincides with the Hoeffding decomposition (2.1) of $f(X)$: in particular, one has
that, for any choice of $i_1, \dots, i_k$, $(-1)^k \mathbb{E}[\Delta_{i_1} \cdots \Delta_{i_k} f(X', X) \mid X] = \varphi_{i_1, \dots, i_k}(X_{i_1}, \dots, X_{i_k})$, and consequently

$$\mathbb{E}\Big\{ \mathbb{E}\big[\Delta_{i_1} \cdots \Delta_{i_k} f(X', X) \,\big|\, X\big] \times \mathbb{E}\big[\Delta_{j_1} \cdots \Delta_{j_l} f(X', X) \,\big|\, X\big] \Big\} = 0, \tag{2.5}$$

whenever $\{i_1, \dots, i_k\} \neq \{j_1, \dots, j_l\}$.

Proof. By Lemma 2.1,

$$f(X) = f(X') + \sum_{k=1}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} (-1)^k \Delta_{i_1} \cdots \Delta_{i_k} f(X', X),$$

and (2.4) follows at once by taking conditional expectations with respect to $X$ on both sides. To
prove (2.5), it suffices to show the following stronger result: for every $1 \le i_1 < \cdots < i_k \le n$ (all $k$
indices different),

$$\mathbb{E}\big[\Delta_{i_1} \cdots \Delta_{i_k} f(X', X) \,\big|\, X_{i_1}, \dots, X_{i_{k-1}}\big] = 0.$$

This is a consequence of the following fact: the random variable $\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X)$ is a function
of $X_{i_1}, \dots, X_{i_{k-1}}$ and of $X'$. By independence, it follows that

$$\mathbb{E}\big[\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X) \,\big|\, X_{i_1}, \dots, X_{i_{k-1}}\big] = \mathbb{E}\big[\big(\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X)\big)^{i_k} \,\big|\, X_{i_1}, \dots, X_{i_{k-1}}\big],$$

where the random variable $\big(\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X)\big)^{i_k}$ has been obtained from $\Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X)$
by replacing $X'_{i_k}$ with $X_{i_k}$. Since (as already observed)

$$\Delta_{i_k} \Delta_{i_1} \cdots \Delta_{i_{k-1}} f(X', X) = \Delta_{i_1} \cdots \Delta_{i_k} f(X', X),$$

we deduce immediately the desired conclusion. □

The next statement is a direct consequence of (2.4)-(2.5).

Corollary 2.3. Let $f(X)$ be as in the statement of Theorem 2.2. Then, the variance of $f(X)$ can
be expanded as follows:

$$\operatorname{Var}(f(X)) = \sum_{k=1}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} \mathbb{E}\Big[\big(\mathbb{E}[\Delta_{i_1} \cdots \Delta_{i_k} f(X', X) \mid X]\big)^2\Big]. \tag{2.6}$$

As a first application of (2.6), we present a useful lower bound for variances.

Corollary 2.4. Let $f(X)$ be as in the statement of Theorem 2.2. Then, one has the lower bound

$$\operatorname{Var}(f(X)) \ge \sum_{i=1}^{n} \mathbb{E}\Big[\big(\mathbb{E}[\Delta_i f(X', X) \mid X]\big)^2\Big].$$

In particular, if $X = (X_1, \dots, X_n)$ is a collection of $n$ i.i.d. random variables with common
distribution equal to $\mu$, and $f : E^n \to \mathbb{R}$ is a symmetric mapping such that $\mathbb{E}[f(X)^2] < \infty$, then

$$\operatorname{Var}(f(X)) \ge n \int_E \big(\mathbb{E}[f(X) - f(x, X_2, \dots, X_n)]\big)^2 \mu(dx).$$

Remark 2.5. The estimates in Corollary 2.4 should be compared with the classical Efron-Stein
inequality (see e.g. [U Chapter 3]), stating that

$$\operatorname{Var}(f(X)) \le \frac{1}{2} \sum_{i=1}^{n} \mathbb{E}\big[\Delta_i f(X, X')^2\big],$$

which, in the case where the $X_i$ are i.i.d. and $f$ is symmetric, becomes

$$\operatorname{Var}(f(X)) \le \frac{n}{2} \int_E \mathbb{E}\big[(f(X) - f(x, X_2, \dots, X_n))^2\big] \mu(dx).$$

For instance, if $f(X) = X_1 + \cdots + X_n$ is a sum of real-valued independent and square-integrable
random variables, then the Efron-Stein upper bound coincides with the lower bound in Corollary
2.4, that is:

$$\sum_{i=1}^{n} \mathbb{E}\Big[\big(\mathbb{E}[\Delta_i f(X', X) \mid X]\big)^2\Big] = \frac{1}{2} \sum_{i=1}^{n} \mathbb{E}\big[\Delta_i f(X, X')^2\big] = \sum_{i=1}^{n} \operatorname{Var}(X_i).$$

Heuristically, in the general case where the $X_i$ are i.i.d. and $f$ is symmetric, it seems that, in
order for the Efron-Stein upper bound and the lower bound of Corollary 2.4 to have the same
magnitude, it is necessary that the functional $f(X)$ is not homogeneous, meaning that the law of
$f(X) - f(x, X_2, \dots, X_n)$ depends on $x$. Examples of such a behaviour will be described in Section
6.1, where we will deal with Voronoi approximations.
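The comparison made in Remark 2.5 can be reproduced exactly on a toy model. The following sketch is an illustration only: the function name `variance_bounds` is ours, and we assume i.i.d. coordinates with a finite support so that every expectation is a finite sum computed with exact fractions. It returns the lower bound of Corollary 2.4, the variance, and the Efron-Stein upper bound.

```python
from fractions import Fraction
from itertools import product


def variance_bounds(f, support, probs, n):
    """Return (lower bound of Corollary 2.4, Var f(X), Efron-Stein bound) for i.i.d. X_1..X_n."""
    states = range(len(support))

    def weight(idx):
        w = Fraction(1)
        for a in idx:
            w *= probs[a]
        return w

    def value(idx):
        return f(tuple(support[a] for a in idx))

    points = list(product(states, repeat=n))
    mean = sum(weight(p) * value(p) for p in points)
    var = sum(weight(p) * (value(p) - mean) ** 2 for p in points)
    lower = upper = Fraction(0)
    for i in range(n):
        for a in states:
            # h = E[f(X) | X_i = support[a]], so that E[Delta_i f(X', X) | X] = mean - h(X_i).
            h = sum(weight(p) * value(p) for p in points if p[i] == a) / probs[a]
            lower += probs[a] * (mean - h) ** 2
        for p in points:
            for b in states:
                q = p[:i] + (b,) + p[i + 1:]  # resample the i-th coordinate
                upper += Fraction(1, 2) * weight(p) * probs[b] * (value(p) - value(q)) ** 2
    return lower, var, upper


uniform = [Fraction(1, 3)] * 3
sum_bounds = variance_bounds(sum, (0, 1, 2), uniform, 3)
max_bounds = variance_bounds(max, (0, 1, 2), uniform, 3)
```

For the sum the three quantities coincide, as stated in the remark; for the maximum the lower and upper bounds are ordered around the variance but no longer equal.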


2.3 Another subset-based interpolation 

Let $n \ge 1$, let $f : E^n \to \mathbb{R}$, and let $y, y' \in E^n$. In [1], the following formula is pointed out:

$$f(y) - f(y') = \sum_{A \subsetneq [n]} \frac{1}{\binom{n}{|A|}(n - |A|)} \sum_{j \notin A} \Delta_j f(y^A, y'), \tag{2.7}$$

where the vector $y^A$ has been obtained from $y$ by replacing $y_i$ with $y'_i$ whenever $i \in A$, in such a way
that, with our notation, $\Delta_j f(y^A, y') = f(y^A) - f(y^{A \cup \{j\}}) = f^A(y, y') - f^{A \cup \{j\}}(y, y')$.

Now consider a vector $X = (X_1, \dots, X_n)$, with independent components and with values in $E^n$,
and let $X'$ be an independent copy of $X$. For every $A \subseteq [n]$, we define $X^A = (X^A_1, \dots, X^A_n)$ according
to the above convention, that is:

$$X^A_i = \begin{cases} X_i & \text{if } i \notin A, \\ X'_i & \text{otherwise.} \end{cases}$$

The following statement is a direct consequence of (2.7).

Proposition 2.6 (See [1], Lemma 2.3). For every $f, g : E^n \to \mathbb{R}$ such that $\mathbb{E}[f(X)^2], \mathbb{E}[g(X)^2] < \infty$,

$$\operatorname{Cov}(f(X), g(X)) = \frac{1}{2} \sum_{A \subsetneq [n]} \frac{1}{\binom{n}{|A|}(n - |A|)} \sum_{j \notin A} \mathbb{E}\big[\Delta_j g(X, X') \Delta_j f(X^A, X')\big]. \tag{2.8}$$

To simplify the notation, we shall sometimes write

$$\kappa_{n,A} = \frac{1}{\binom{n}{|A|}(n - |A|)}.$$

Observe that, for every $j$, $\sum_{A \subseteq [n] \,:\, j \notin A} \kappa_{n,A} = 1$.

Remark 2.7. As demonstrated in [5, Lemmas 7.8-7.10], the identity (2.8) can also be used to
deduce effective lower bounds on variances. Such lower bounds seem to have a different nature
from the ones that can be proved by means of Hoeffding decompositions.
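The interpolation formula (2.7) and the normalisation of the weights $\kappa_{n,A}$ can be verified on a small example. The sketch below is illustrative only: the helper names are ours, coordinates are indexed from 0, and exact fractions are used so that the telescoping is checked without rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def kappa(n, size):
    """Weight kappa_{n,A}, which depends on A only through its cardinality."""
    return Fraction(1, comb(n, size) * (n - size))


def swap(y, y_prime, A):
    """Vector y^A: coordinates with index in A are taken from y_prime, the others from y."""
    return tuple(y_prime[i] if i in A else y[i] for i in range(len(y)))


def interpolation_rhs(f, y, y_prime):
    """Right-hand side of (2.7), summing over all proper subsets A of {0, ..., n-1}."""
    n = len(y)
    total = Fraction(0)
    for size in range(n):
        for A in map(set, combinations(range(n), size)):
            for j in set(range(n)) - A:
                diff = f(swap(y, y_prime, A)) - f(swap(y, y_prime, A | {j}))
                total += kappa(n, size) * diff
    return total


f = lambda y: y[0] * y[1] - 3 * y[2] + y[0] * y[1] * y[2] * y[3]
y, y_prime = (1, 2, 3, 4), (5, 6, 7, 8)
rhs = interpolation_rhs(f, y, y_prime)

# Sum of kappa_{n,A} over the subsets A that do not contain a fixed index j (here n = 4).
kappa_mass = sum(kappa(4, s) * comb(3, s) for s in range(4))
```

The value `rhs` equals `f(y) - f(y_prime)`, and `kappa_mass` equals 1, matching the observation made after Proposition 2.6.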


3 Stein's method and a new approximate Taylor expansion

Let $U$ and $V$ be two real-valued random variables. The Kolmogorov distance between the
distributions of $U$ and $V$ is given by

$$d_K(U, V) = \sup_{t \in \mathbb{R}} \big| \mathbb{P}(U \le t) - \mathbb{P}(V \le t) \big|.$$

As anticipated in the Introduction, our aim in this paper is to provide upper bounds for
quantities of the type $d_K(W, N)$, where $W = f(X)$ and $N$ is a standard Gaussian random variable, that
are based on the use of Stein's method. The following statement gathers together some classical
facts concerning Stein's equations and their solutions (see Points (a)-(e) below), together with a
new important approximate Taylor expansion for solutions of Stein's equations, that we partially
extrapolated from reference [7] (see Point (f) below), generalising previous findings from [26]; see
also [21, Theorem 2].

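Proposition 3.1. Let $N \sim \mathcal{N}(0, 1)$ be a centred Gaussian random variable with variance 1 and,
for every $t \in \mathbb{R}$, consider the Stein equation

$$g'(w) - w g(w) = \mathbf{1}_{\{w \le t\}} - \mathbb{P}(N \le t), \tag{3.1}$$

where $w \in \mathbb{R}$. Then, for every real $t$, there exists a function $g_t : \mathbb{R} \to \mathbb{R} : w \mapsto g_t(w)$ with the
following properties:

(a) $g_t$ is continuous at every point $w \in \mathbb{R}$, and infinitely differentiable at every $w \neq t$;

(b) $g_t$ satisfies the relation (3.1), for every $w \neq t$;

(c) $0 < g_t \le c := \frac{\sqrt{2\pi}}{4}$;

(d) for every $u, v, w \in \mathbb{R}$,

$$\big|(w + u) g_t(w + u) - (w + v) g_t(w + v)\big| \le \Big(|w| + \frac{\sqrt{2\pi}}{4}\Big)(|u| + |v|); \tag{3.2}$$

(e) adopting the convention

$$g'_t(t) := t g_t(t) + 1 - \mathbb{P}(N \le t), \tag{3.3}$$

one has that $|g'_t(w)| \le 1$, for every real $w$;

(f) using again the convention (3.3), for all $w, h \in \mathbb{R}$ one has that

$$\big|g_t(w + h) - g_t(w) - g'_t(w) h\big| \le \frac{|h|^2}{2}\Big(|w| + \frac{\sqrt{2\pi}}{4}\Big) + |h| \big(\mathbf{1}_{[w, w+h)}(t) + \mathbf{1}_{[w+h, w)}(t)\big) \tag{3.4}$$

$$= \frac{|h|^2}{2}\Big(|w| + \frac{\sqrt{2\pi}}{4}\Big) + h \big(\mathbf{1}_{[w, w+h)}(t) - \mathbf{1}_{[w+h, w)}(t)\big). \tag{3.5}$$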

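Proof. The proofs of Points (a)-(e) are classical, and can be found e.g. in m Lemma 2.3]. We will
prove (f) by following the same line of reasoning adopted in [3, Proof of Theorem 3.1]. Fix $t \in \mathbb{R}$,
recall the convention (3.3) and observe that, for every $w, h \in \mathbb{R}$, we can write

$$g_t(w + h) - g_t(w) - h g'_t(w) = \int_0^h \big(g'_t(w + u) - g'_t(w)\big) \, du.$$

Since $g_t$ solves the Stein equation (3.1) for every real $w$, we have that, for all $w, h \in \mathbb{R}$,

$$g_t(w + h) - g_t(w) - h g'_t(w) = \int_0^h \big((w + u) g_t(w + u) - w g_t(w)\big) \, du + \int_0^h \big(\mathbf{1}_{\{w + u \le t\}} - \mathbf{1}_{\{w \le t\}}\big) \, du := I_1 + I_2.$$

It follows that, by the triangle inequality,

$$\big|g_t(w + h) - g_t(w) - h g'_t(w)\big| \le |I_1| + |I_2|. \tag{3.6}$$

Using (3.2), we have

$$|I_1| \le \int_0^{|h|} \Big(|w| + \frac{\sqrt{2\pi}}{4}\Big) u \, du = \frac{h^2}{2}\Big(|w| + \frac{\sqrt{2\pi}}{4}\Big). \tag{3.7}$$

Furthermore, observe that

$$|I_2| = \mathbf{1}_{\{h < 0\}} \Big| \int_0^h \big(\mathbf{1}_{\{w + u \le t\}} - \mathbf{1}_{\{w \le t\}}\big) \, du \Big| + \mathbf{1}_{\{h \ge 0\}} \Big| \int_0^h \big(\mathbf{1}_{\{w + u \le t\}} - \mathbf{1}_{\{w \le t\}}\big) \, du \Big|$$

$$= \mathbf{1}_{\{h < 0\}} \int_h^0 \mathbf{1}_{\{w + u \le t < w\}} \, du + \mathbf{1}_{\{h \ge 0\}} \int_0^h \mathbf{1}_{\{w \le t < w + u\}} \, du.$$

Bounding $u$ by $h$ in both integrals provides the following upper bound:

$$|I_2| \le \mathbf{1}_{\{h < 0\}} |h| \, \mathbf{1}_{[w+h, w)}(t) + \mathbf{1}_{\{h \ge 0\}} h \, \mathbf{1}_{[w, w+h)}(t)$$

$$= h \big(\mathbf{1}_{[w, w+h)}(t) - \mathbf{1}_{[w+h, w)}(t)\big) = |h| \big(\mathbf{1}_{[w, w+h)}(t) + \mathbf{1}_{[w+h, w)}(t)\big). \tag{3.8}$$

Applying the estimates (3.7) and (3.8) to (3.6) concludes the proof. □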

An immediate consequence of Proposition 3.1 is that for $N \sim \mathcal{N}(0, 1)$ and for every real-valued
random variable $W$, one has that

$$d_K(W, N) = \sup_{t \in \mathbb{R}} \big| \mathbb{E}[g'_t(W) - W g_t(W)] \big| \tag{3.9}$$

(observe in particular that convention (3.3) defines unambiguously the quantity $g'_t(x)$ for every
$t, x \in \mathbb{R}$).
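Points (c), (e) and (f) of Proposition 3.1 can be probed numerically. The sketch below is an illustration under one assumption that is not displayed in the text: we take for $g_t$ the classical explicit solution $g_t(w) = \sqrt{2\pi}\, e^{w^2/2}\, \Phi(\min(w,t))\,(1 - \Phi(\max(w,t)))$, with $\Phi$ the standard Gaussian distribution function. The function names are ours.

```python
import math


def Phi(x):
    """Standard Gaussian distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def g(t, w):
    """Classical solution of the Stein equation (3.1) (assumed explicit form)."""
    c = math.sqrt(2.0 * math.pi) * math.exp(0.5 * w * w)
    return c * Phi(w) * Phi(-t) if w <= t else c * Phi(t) * Phi(-w)


def g_prime(t, w):
    """Derivative given by (3.1); at w = t this is exactly the convention (3.3)."""
    return w * g(t, w) + (1.0 if w <= t else 0.0) - Phi(t)


def taylor_slack(t, w, h):
    """Right-hand side of (3.4) minus its left-hand side; should be non-negative."""
    lhs = abs(g(t, w + h) - g(t, w) - g_prime(t, w) * h)
    lo, hi = min(w, w + h), max(w, w + h)
    indicator = 1.0 if lo <= t < hi else 0.0  # 1_[w,w+h)(t) + 1_[w+h,w)(t)
    rhs = 0.5 * h * h * (abs(w) + math.sqrt(2.0 * math.pi) / 4.0) + abs(h) * indicator
    return rhs - lhs


grid = [k / 4.0 for k in range(-16, 17)]       # w in [-4, 4]
steps = [k / 4.0 for k in range(-8, 9)]        # h in [-2, 2]
levels = [-2.0, -0.5, 0.0, 0.3, 1.5]           # a few values of t
worst_slack = min(taylor_slack(t, w, h) for t in levels for w in grid for h in steps)
max_g = max(g(t, w) for t in levels for w in grid)
max_abs_derivative = max(abs(g_prime(t, w)) for t in levels for w in grid)
```

On this grid `max_g` stays at or below $\sqrt{2\pi}/4$, `max_abs_derivative` stays below 1, and `worst_slack` is non-negative up to rounding.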

4 New Berry-Esseen bounds in the Kolmogorov 
distance 

Let $n \ge 1$ be an integer, and consider a vector $X = (X_1, \dots, X_n)$ of independent random variables
with values in the Polish space $E$. Let $X' = (X'_1, \dots, X'_n)$ be an independent copy of $X$. Consider
a function $f : E^n \to \mathbb{R}$ such that $W := f(X)$ is a centred and square-integrable random variable.
We shall adopt the same notation introduced in Sections 2.1, 2.2, 2.3 and 3. For every $A \subsetneq [n]$, we
write

$$T_A = \sum_{j \notin A} \Delta_j f(X, X') \, \Delta_j f(X^A, X'),$$

$$T'_A = \sum_{j \notin A} \Delta_j f(X, X') \, \big|\Delta_j f(X^A, X')\big|,$$

and

$$T = \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} T_A,$$

$$T' = \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} T'_A.$$

Observe that each $T'_A$ is a sum of symmetric random variables, in such a way that $0 = \mathbb{E}[T'] = \mathbb{E}[T'_A]$,
$A \subsetneq [n]$.

Remark 4.1. An immediate application of (2.8) implies that $\operatorname{Var}(f(X)) = \mathbb{E}[T]$. We stress that
the random variables $T_A$ and $T$ already appear in [3], in the context of normal approximations in
the Wasserstein distance. Our use of the class of random objects $\{T', T'_A : A \subsetneq [n]\}$ for deducing
bounds in the Kolmogorov distance is new.
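The two facts recalled above, $\mathbb{E}[T] = \operatorname{Var}(f(X))$ and $\mathbb{E}[T'] = 0$, can be checked by exhaustive enumeration. The sketch below is illustrative only: the function names are ours, coordinates are indexed from 0, and we assume i.i.d. Bernoulli coordinates so that all expectations are finite sums in exact arithmetic.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def swap(x, x_prime, A):
    """Vector X^A: coordinates with index in A are taken from x_prime."""
    return tuple(x_prime[i] if i in A else x[i] for i in range(len(x)))


def T_and_T_prime(f, x, x_prime):
    """Evaluate T and T' at a given realisation (x, x')."""
    n = len(x)
    T = Tp = Fraction(0)
    for size in range(n):
        k = Fraction(1, comb(n, size) * (n - size))  # kappa_{n,A}
        for A in map(set, combinations(range(n), size)):
            xA = swap(x, x_prime, A)
            for j in set(range(n)) - A:
                d = f(x) - f(swap(x, x_prime, {j}))       # Delta_j f(X, X')
                dA = f(xA) - f(swap(x, x_prime, A | {j}))  # Delta_j f(X^A, X')
                T += k * d * dA / 2
                Tp += k * d * abs(dA) / 2
    return T, Tp


def expectations(f, n, p):
    """Exact E[T], E[T'] and Var f(X) for i.i.d. Bernoulli(p) coordinates."""
    def weight(x):
        w = Fraction(1)
        for xi in x:
            w *= p if xi else 1 - p
        return w

    cube = list(product((0, 1), repeat=n))
    mean = sum(weight(x) * f(x) for x in cube)
    var = sum(weight(x) * (f(x) - mean) ** 2 for x in cube)
    ET = ETp = Fraction(0)
    for x in cube:
        for xp in cube:
            T, Tp = T_and_T_prime(f, x, xp)
            ET += weight(x) * weight(xp) * T
            ETp += weight(x) * weight(xp) * Tp
    return ET, ETp, var


f = lambda x: max(x) + x[0] * x[2]   # a deliberately non-symmetric test function
ET, ETp, var = expectations(f, 3, Fraction(1, 3))
```

The function $f$ is not centred here; this does not matter, since $T$ and $T'$ only involve differences of $f$.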

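The next statement is the main abstract finding of the paper.

Theorem 4.2. Let the assumptions and notation of the present section prevail, let $N \sim \mathcal{N}(0, 1)$,
and assume that $\mathbb{E} W = 0$ and $\mathbb{E} W^2 = \sigma^2 \in (0, \infty)$. Then,

$$d_K(\sigma^{-1} W, N) \le \frac{1}{\sigma^2}\sqrt{\operatorname{Var}(\mathbb{E}(T \mid X))} + \frac{1}{\sigma^2}\sqrt{\operatorname{Var}(\mathbb{E}(T' \mid X))}$$
$$\qquad + \frac{1}{4\sigma^4} \sum_{j, A \,:\, j \notin A} \kappa_{n,A} \, \mathbb{E}\big[|f(X)| \, \Delta_j f(X, X')^2 \, |\Delta_j f(X^A, X')|\big] + \frac{\sqrt{2\pi}}{16\sigma^3} \sum_{j=1}^{n} \mathbb{E}|\Delta_j f(X, X')|^3 \tag{4.1}$$

$$\le \frac{1}{\sigma^2}\sqrt{\operatorname{Var}(\mathbb{E}(T \mid X))} + \frac{1}{\sigma^2}\sqrt{\operatorname{Var}(\mathbb{E}(T' \mid X))}$$
$$\qquad + \frac{1}{4\sigma^3} \sum_{j=1}^{n} \sqrt{\mathbb{E}|\Delta_j f(X, X')|^6} + \frac{\sqrt{2\pi}}{16\sigma^3} \sum_{j=1}^{n} \mathbb{E}|\Delta_j f(X, X')|^3. \tag{4.2}$$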

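Proof. By homogeneity, we can assume that $\sigma = 1$, without loss of generality. By virtue of (3.9),
the Kolmogorov distance between $W$ and $N$ is the supremum over $t \in \mathbb{R}$ of

$$\big|\mathbb{E}[g'_t(W) - W g_t(W)]\big| \le \big|\mathbb{E}[g'_t(W) - g'_t(W) T]\big| + \big|\mathbb{E}[g_t(W) W - g'_t(W) T]\big|, \tag{4.3}$$

where the derivative $g'_t(w)$ is defined for every real $w$, thanks to the convention (3.3). Since $W$ is
$\sigma(X)$-measurable, $|g'_t| \le 1$ and $\mathbb{E} T = \mathbb{E} W^2 = 1$, one infers that

$$\big|\mathbb{E}[g'_t(W) - g'_t(W) T]\big| \le \mathbb{E}\big[|g'_t(W) \times \mathbb{E}[T - 1 \mid X]|\big] \le \mathbb{E}\big|\mathbb{E}[T - 1 \mid X]\big| \le \sqrt{\operatorname{Var}(\mathbb{E}(T \mid X))}.$$

Our aim is now to show that the quantity $|\mathbb{E}[g_t(W) W - g'_t(W) T]|$ is bounded by the last three
summands on the right-hand side of (4.1) (with $\sigma = 1$). Reasoning as in [4], the relation (2.8)
applied to $\mathbb{E}[g_t(W) W]$ and the definition of $T$ yield

$$\big|\mathbb{E}[g_t(W) W - g'_t(W) T]\big| = \frac{1}{2} \Big| \sum_{A \subsetneq [n]} \kappa_{n,A} \sum_{j \notin A} \mathbb{E}\big(R_{A,j} - \tilde{R}_{A,j}\big) \Big| \le \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} \sum_{j \notin A} \mathbb{E}\big|R_{A,j} - \tilde{R}_{A,j}\big|,$$

with

$$R_{A,j} = \Delta_j (g_t \circ f)(X) \, \Delta_j f(X^A),$$
$$\tilde{R}_{A,j} = g'_t(f(X)) \, \Delta_j f(X) \, \Delta_j f(X^A),$$

where, here and for the rest of the proof, we use the simplified notation $\Delta_j f(X^A) = \Delta_j f(X^A, X')$,
$\Delta_j f(X) = \Delta_j f(X, X')$, and so on. We have

$$\mathbb{E}\big|R_{A,j} - \tilde{R}_{A,j}\big| = \mathbb{E}\Big[\big|g_t(f(X) - \Delta_j f(X)) - g_t(f(X)) - g'_t(f(X))(-\Delta_j f(X))\big| \times |\Delta_j f(X^A)|\Big].$$

Now we use (3.5) with $w = f(X)$, $h = -\Delta_j f(X)$, together with the fact that

$$h \big(\mathbf{1}_{[w, w+h)}(t) - \mathbf{1}_{[w+h, w)}(t)\big) = h \big(\mathbf{1}_{\{w + h > t\}} - \mathbf{1}_{\{w > t\}}\big),$$

to deduce that

$$\big|\mathbb{E}[g_t(W) W - g'_t(W) T]\big| \le \frac{1}{2} \sum_{j, A \,:\, j \notin A} \kappa_{n,A} \, \mathbb{E}\Big\{ \frac{1}{2}\Big(|f(X)| + \frac{\sqrt{2\pi}}{4}\Big) |\Delta_j f(X)|^2 |\Delta_j f(X^A)| + \Delta_j\big(\mathbf{1}_{\{f(X) > t\}}\big) \Delta_j f(X) |\Delta_j f(X^A)| \Big\}. \tag{4.4}$$

Using the independence of $X$ and $X'$, one proves immediately that, for $j \notin A$,

$$\mathbb{E}\big[\Delta_j\big(\mathbf{1}_{\{f(X) > t\}}\big) \Delta_j f(X) |\Delta_j f(X^A)|\big] = 2 \, \mathbb{E}\big[\mathbf{1}_{\{f(X) > t\}} \Delta_j f(X) |\Delta_j f(X^A)|\big],$$

from which it follows that the right-hand side of (4.4) is bounded by

$$\frac{1}{4} \sum_{j, A \,:\, j \notin A} \kappa_{n,A} \, \mathbb{E}\Big[\Big(|f(X)| + \frac{\sqrt{2\pi}}{4}\Big) \big|\Delta_j f(X)^2 \Delta_j f(X^A)\big|\Big] + \big|\mathbb{E}[\mathbf{1}_{\{f(X) > t\}} \times T']\big|$$

$$\le \frac{1}{4} \sum_{j, A \,:\, j \notin A} \kappa_{n,A} \, \mathbb{E}\Big[\Big(|f(X)| + \frac{\sqrt{2\pi}}{4}\Big) \big|\Delta_j f(X)^2 \Delta_j f(X^A)\big|\Big] + \sqrt{\operatorname{Var}(\mathbb{E}(T' \mid X))},$$

where we have applied the Cauchy-Schwarz inequality, together with the fact that indicator
functions are bounded by 1. The bound (4.1) is obtained by using the Hölder inequality in order to
deduce that, for all $j, A$,

$$\mathbb{E}|\Delta_j f(X)|^2 |\Delta_j f(X^A)| \le \mathbb{E}|\Delta_j f(X)|^3,$$

and (4.2) follows by

$$\mathbb{E}|f(X)| |\Delta_j f(X)|^2 |\Delta_j f(X^A)| \le \sqrt{\mathbb{E} f(X)^2} \sqrt{\mathbb{E}\big[\Delta_j f(X)^4 \Delta_j f(X^A)^2\big]}$$

$$\le \sqrt{\big(\mathbb{E} \Delta_j f(X)^{4(3/2)}\big)^{2/3} \big(\mathbb{E} \Delta_j f(X^A)^{2(3)}\big)^{1/3}} = \big(\mathbb{E} \Delta_j f(X)^6\big)^{1/2},$$

where we have used the fact that $X$ and $X^A$ have the same distribution. □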

Remark 4.3. Recall that the Wasserstein distance between the laws of two real-valued random
variables $U, V$ is defined as

$$d_W(U, V) := \sup_h \big|\mathbb{E}[h(U)] - \mathbb{E}[h(V)]\big|,$$

where the supremum runs over all 1-Lipschitz functions $h : \mathbb{R} \to \mathbb{R}$. In [H Theorem 2.2], one can
find the following bound: under the assumptions of Theorem 4.2,

$$d_W(\sigma^{-1} W, N) \le \frac{1}{\sigma^2}\sqrt{\operatorname{Var}(\mathbb{E}(T \mid X))} + \frac{1}{2\sigma^3} \sum_{j=1}^{n} \mathbb{E}|\Delta_j f(X, X')|^3. \tag{4.5}$$

Example 4.4. Consider a vector $X = (X_1, \dots, X_n)$ of i.i.d. random variables with mean zero and
variance 1, and assume that $\mathbb{E}|X_1|^4 < \infty$. Define $W = f(X) = n^{-1/2}(X_1 + \cdots + X_n)$. It is easily
seen that, in this case, for every $A \subsetneq [n]$ and every $j \notin A$, $\Delta_j f(X^A, X') = n^{-1/2}(X_j - X'_j)$, in such a way that

$$T = \frac{1}{2n} \sum_{i=1}^{n} (X_i - X'_i)^2 \quad \text{and} \quad T' = \frac{1}{2n} \sum_{i=1}^{n} \operatorname{sign}(X_i - X'_i)(X_i - X'_i)^2.$$

We also have, denoting by $X^{\hat{j}}$ the vector $X$ after removing $X_j$,

$$\mathbb{E}|f(X) \Delta_j f(X)^2 \Delta_j f(X^A)| \le \mathbb{E}|f(X) - f(X^{\hat{j}})| |\Delta_j f(X)^2 \Delta_j f(X^A)| + \mathbb{E}|f(X^{\hat{j}})| \, \mathbb{E}|\Delta_j f(X)^2| |\Delta_j f(X^A)|$$

$$\le \mathbb{E}\, n^{-2} |X_j| |X_j - X'_j|^2 |X_j - X'_j| + \mathbb{E}|f(X^{\hat{j}})| \, \mathbb{E}\, n^{-3/2} |X_j - X'_j|^2 |X_j - X'_j|$$

$$\le 8\big(n^{-2} \mathbb{E} X_1^4 + n^{-3/2} \mathbb{E}|X_1|^3\big)$$

(note that the bound (4.2) can be used instead, whenever $\mathbb{E} X_1^6 < \infty$). An elementary application
of (4.1) yields therefore that there exists a finite constant $C > 0$, independent of $n$, such that

$$d_K(W, N) \le \frac{C}{\sqrt{n}},$$

providing a rate of convergence that is consistent with the usual Berry-Esseen estimates. One
should notice that the estimate (4.5) yields the similar bound $d_W(W, N) \le C/\sqrt{n}$.
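The $n^{-1/2}$ rate of Example 4.4 can be observed exactly in the simplest case. The sketch below is an illustration only, assuming symmetric $\pm 1$ (Rademacher) summands, for which the law of $W$ is a rescaled binomial; it computes the Kolmogorov distance itself and does not evaluate the right-hand side of (4.1). The function names are ours.

```python
import math


def Phi(x):
    """Standard Gaussian distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def kolmogorov_distance_rademacher(n):
    """Exact d_K between W = n^{-1/2}(X_1 + ... + X_n), X_i = +-1 with prob. 1/2, and N(0, 1)."""
    total = 2 ** n
    below = 0      # number of sign patterns giving a value strictly smaller than the current atom
    worst = 0.0
    for k in range(n + 1):                   # k = number of +1's, so W = (2k - n) / sqrt(n)
        t = (2 * k - n) / math.sqrt(n)
        left = below / total                 # P(W < t)
        below += math.comb(n, k)
        right = below / total                # P(W <= t)
        # The supremum over t is attained at an atom or just to its left.
        worst = max(worst, abs(left - Phi(t)), abs(right - Phi(t)))
    return worst


rescaled = {n: math.sqrt(n) * kolmogorov_distance_rademacher(n) for n in (10, 100, 1000)}
```

The rescaled distances $\sqrt{n}\, d_K(W, N)$ collected in `rescaled` remain bounded as $n$ grows, which is the behaviour described by the bound $d_K(W, N) \le C/\sqrt{n}$.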


5 Symmetric functions and geometric applications 

In this section we adapt our results to random structures with local dependence, in a spirit close to
[H Section 2.3] (see Remark 5.4 below). Our principal focus will be on measurable and symmetric
real-valued mappings $f$ on $E^n$: we recall that $f : E^n \to \mathbb{R}$ is said to be symmetric if

$$f(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = f(x_1, \dots, x_n)$$

for any permutation $\sigma$ of $[n]$ and vector $x \in E^n$.

In the following, $X$ and $X'$ denote two independent sets of $n$ i.i.d. random variables with
common generic distribution $\mu$. We will use the following short-hand notation: for any random
vector $Z$ of dimension $n$, and for every $1 \le i \neq j \le n$,

$$\Delta_i f(Z) := \Delta_i f(Z, X'), \qquad \Delta_{i,j} f(Z) := \Delta_i \Delta_j f(Z, X'),$$

where the notation is the same as in Section 2.1; we also adopt the additional convention that
$\Delta_{i,i} = \Delta_i$. Now let $\tilde{X}$ be a further independent copy of $X$. We shall use the following terminology:
a vector $Z = (Z_1, \dots, Z_n)$ is a recombination of $\{X, X', \tilde{X}\}$ if $Z_i \in \{X_i, X'_i, \tilde{X}_i\}$ for every $1 \le i \le n$.

The next statement provides a bound for the normal approximation of geometric functionals
that is amenable to geometric analysis, and can be heuristically regarded as the binomial
counterpart to the second order Poincaré inequalities on the Poisson space (in the Kolmogorov distance),
proved in m-

Theorem 5.1. Let $f : E^n \to \mathbb{R}$ be a symmetric measurable functional such that $W = f(X)$ is
centred, and $\sigma^2 = \operatorname{Var}(W) < \infty$. Let $N$ be a centred Gaussian random variable with variance 1.
Define

$$B_n(f) := \sup_{(Y, Z, Z')} \mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0\}} \Delta_1 f(Z)^2 \Delta_2 f(Z')^2\big],$$

$$B'_n(f) := \sup_{(Y, Y', Z, Z')} \mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0,\, \Delta_{1,3} f(Y') \neq 0\}} \Delta_2 f(Z)^2 \Delta_3 f(Z')^2\big],$$

where the suprema run over all vectors $Y, Y', Z, Z'$ that are recombinations of $\{X, X', \tilde{X}\}$. Then

$$d_K(\sigma^{-1} W, N) \le \frac{4\sqrt{2}\, n^{1/2}}{\sigma^2} \Big[\sqrt{n B_n(f)} + \sqrt{n^2 B'_n(f)} + \sqrt{\mathbb{E} \Delta_1 f(X)^4}\Big]$$
$$\qquad + \frac{n}{4\sigma^4} \sup_{A \subseteq [n]} \mathbb{E}\big|f(X) \Delta_1 f(X^A)^3\big| + \frac{\sqrt{2\pi}}{16\sigma^3}\, n\, \mathbb{E}|\Delta_1 f(X)|^3. \tag{5.1}$$

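Remark 5.2. We shall often use the following bounds, following at once from the Cauchy-Schwarz
inequality,

$$B'_n(f) \le \sup_{(Y, Y', Z, Z')} \sqrt{\mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0,\, \Delta_{1,3} f(Y') \neq 0\}} \Delta_2 f(Z)^4\big] \, \mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0,\, \Delta_{1,3} f(Y') \neq 0\}} \Delta_3 f(Z')^4\big]}$$

$$\le \sup_{(Y, Y', Z)} \mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0,\, \Delta_{1,3} f(Y') \neq 0\}} \Delta_2 f(Z)^4\big] \tag{5.2}$$

and

$$B_n(f) \le \sup_{(Y, Z)} \mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0\}} \Delta_1 f(Z)^4\big]. \tag{5.3}$$

In the framework of the applications developed in this paper, such estimates simplify some
computations and do not worsen the associated rates of convergence.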

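In the applications developed below, we will often consider functions $f$ that are obtained as
restrictions to $E^n$ of general real-valued mappings on the set $\bigcup_{n \ge 1} E^n$, corresponding to the class
of all finite ordered point configurations (with possible repetitions). Now fix $f : \bigcup_{n \ge 1} E^n \to \mathbb{R}$
and, for every $n > 1$ and every $x = (x_1, \dots, x_n) \in E^n$, introduce the notation $x^{\hat{i}}$ to indicate the
element of $E^{n-1}$ obtained by deleting the $i$-th coordinate of $x$, that is: $x^{\hat{i}} = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$.
Analogously, write $x^{\hat{i}\hat{j}} \in E^{n-2}$ to denote the vector obtained from $x$ by removing its $i$-th and $j$-th
coordinates. We write

$$D_i f(x) = f(x) - f(x^{\hat{i}}),$$

$$D_{i,j} f(x) = f(x) - f(x^{\hat{i}}) - f(x^{\hat{j}}) + f(x^{\hat{i}\hat{j}}) = D_{j,i} f(x).$$

Proposition 5.3. Let $f$ be a functional defined on $\bigcup_{n \ge 1} E^n$ such that its restriction to $E^n$ satisfies
the hypotheses of Theorem 5.1. Then we have

$$B'_n(f) \le 2^8 \sup_{(Y, Y', Z, Z')} \mathbb{E}\big[\mathbf{1}_{\{D_{1,2} f(Y) \neq 0\}} \mathbf{1}_{\{D_{1,3} f(Y') \neq 0\}} D_2 f(Z)^2 D_3 f(Z')^2\big],$$

$$B_n(f) \le 2^6 \sup_{(Y, Z, Z')} \mathbb{E}\big[\mathbf{1}_{\{D_{1,2} f(Y) \neq 0\}} D_1 f(Z)^2 D_2 f(Z')^2\big].$$

Proof. First observe that, writing $X^{\{i\}}$, $X^{\{j\}}$ and $X^{\{i,j\}}$ for the vectors $X^A$ of Section 2.3 with
$A = \{i\}$, $\{j\}$ and $\{i, j\}$ respectively,

$$|\Delta_i f(X)| \le |D_i f(X)| + |D_i f(X^{\{i\}})|, \tag{5.4}$$

$$\Delta_{i,j} f(X) = D_{i,j} f(X) - D_{i,j} f(X^{\{i\}}) - D_{i,j} f(X^{\{j\}}) + D_{i,j} f(X^{\{i,j\}}). \tag{5.5}$$

Let $Y, Y', Z, Z'$ be recombinations of $\{X, X', \tilde{X}\}$. Using the bounds above, there are recombinations
$Y^{(i)}, Y'^{(i)}$, $i = 1, \dots, 4$, and $Z^{(l)}, Z'^{(l)}$, $l = 1, 2$, such that

$$\mathbb{E}\big[\mathbf{1}_{\{\Delta_{1,2} f(Y) \neq 0,\, \Delta_{1,3} f(Y') \neq 0\}} \Delta_2 f(Z)^2 \Delta_3 f(Z')^2\big]$$

$$\le \mathbb{E}\Big[\Big(\sum_{i=1}^{4} \sum_{j=1}^{4} \mathbf{1}_{\{D_{1,2} f(Y^{(i)}) \neq 0\}} \mathbf{1}_{\{D_{1,3} f(Y'^{(j)}) \neq 0\}}\Big) \sum_{l,m=1}^{2} 4\, D_2 f(Z^{(l)})^2 D_3 f(Z'^{(m)})^2\Big]$$

$$\le 256 \sup_{(Y, Y', Z, Z')} \mathbb{E}\big[\mathbf{1}_{\{D_{1,2} f(Y) \neq 0\}} \mathbf{1}_{\{D_{1,3} f(Y') \neq 0\}} D_2 f(Z)^2 D_3 f(Z')^2\big],$$

which gives the bound on $B'_n(f)$. The bound on $B_n(f)$ is obtained analogously. □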

Remark 5.4. Our framework is more restrictive than that of [H Theorem 2.5], where it is not
assumed that $f$ is symmetric, but rather that its dependency graph is symmetric, meaning that
the relation $\Delta_{i,j} f(X) = 0$ is equivalent to $\Delta_{\sigma(i),\sigma(j)} f(X^\sigma) = 0$ for any $i \neq j$ and every permutation
$\sigma$ of $\{1, \dots, n\}$, where $X^\sigma := (X_{\sigma(1)}, \dots, X_{\sigma(n)})$. One should notice that this subtlety is not exploited
in most applications of [1] (see e.g. [2T]). Under our symmetry assumption, a bound analogous to the main
estimate in [U Theorem 2.5] can be retrieved from (5.1) by using the bounds


yjEAjfiXf + + Vn^BUf) 

< S^EAjfiX)^ + nBnif) + n^B'^if) 


< 3. 




1 ^ sup El{Aij/(Y)^ 0 }l{Ai,fe/(Y')< 0 }^l/(^)^^fc/(^')^ 


n 

<6^2 V sup n-2E(sup|Aj/(Z)|)45i(y)5i(y') 

\,^k=l^Y,Y',Z) 

< QV2{EM{Xff/^{E5i{Xf f/‘^ 


where $M(X) = \sup_j |\Delta_j f(X)|$ and $\delta_1(X) = \#\{j : \Delta_{1,j} f(X) \neq 0\}$. One should notice that the
additional term involving quantities of the type $\mathbb{E}|f(X) \Delta_1 f(X)^2 \Delta_1 f(X^A)|$ appears in our bounds
because we are dealing with the Kolmogorov distance: in general, we shall control this term by
using the rough estimate $\mathbb{E}|f(X) \Delta_1 f(X)^2 \Delta_1 f(X^A)| \le \sigma \sqrt{\mathbb{E} \Delta_1 f(X)^6}$, that one can e.g. deduce
by applying twice the Cauchy-Schwarz inequality (see Section 6 for more details).


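Proof of Theorem 5.1. Assume without loss of generality that $\sigma = 1$. Our estimate follows by
appropriately bounding each of the four summands appearing on the right-hand side of (4.1). We
have for $A \subsetneq [n]$, $1 \le j \le n$, by the Hölder inequality,

$$\mathbb{E}\big|f(X) \Delta_j f(X)^2 \Delta_j f(X^A)\big| = \mathbb{E}\big|f(X)^{2/3} \Delta_j f(X)^2\big| \big|f(X)^{1/3} \Delta_j f(X^A)\big|$$

$$\le \big(\mathbb{E}|f(X) \Delta_j f(X)^3|\big)^{2/3} \big(\mathbb{E}|f(X) \Delta_j f(X^A)^3|\big)^{1/3}$$

$$\le \sup_{A \subseteq [n]} \mathbb{E}\big|f(X) \Delta_j f(X^A)^3\big|,$$

because $\Delta_j f(X) = \Delta_j f(X^{\emptyset})$. The two last terms on the right-hand side of (4.1) are therefore
bounded by the last two terms in (5.1), in view of the symmetry of $f$ and of the relation
$\sum_{A \subsetneq [n] \,:\, j \notin A} \kappa_{n,A} = 1$. To control the first two summands in (4.1), we first bound the square root
of the variance of a random variable of the type $U := \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} U_A$, for a general family of
square-integrable random variables $U_A(X, X')$, $A \subsetneq [n]$. Using e.g. [H Lemma 4.4], we infer that

$$\sqrt{\operatorname{Var}(\mathbb{E}(U \mid X))} \le \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} \sqrt{\operatorname{Var}(\mathbb{E}(U_A \mid X))} \le \frac{1}{2} \sum_{A \subsetneq [n]} \kappa_{n,A} \sqrt{\mathbb{E}(\operatorname{Var}(U_A \mid X'))}. \tag{5.6}$$

This inequality will be used both for $U_A = T_A$ and $U_A = T'_A$. Let us now bound each summand
separately. Fix $A \subsetneq [n]$. Introduce the substitution operator based on $\tilde{X} = (\tilde{X}_1, \dots, \tilde{X}_n)$:

$$S_i(X) = (X_1, \dots, \tilde{X}_i, \dots, X_n).$$

Recall that, by the Efron-Stein inequality, for any square-integrable functional $Z(X_1, \dots, X_n)$,

$$\operatorname{Var}(Z) \le \frac{1}{2} \sum_{i=1}^{n} \mathbb{E}\big(\tilde{\Delta}_i Z(X)\big)^2,$$

where

$$(\tilde{\Delta}_i Z)(X) := Z(S_i(X)) - Z(X)$$

is clearly centred. Applying this to $Z(X) = U_A(X, X')$ for fixed $X'$,

$$\operatorname{Var}(U_A \mid X') \le \frac{1}{2} \sum_{i=1}^{n} \mathbb{E}\big[(\tilde{\Delta}_i U_A(X, X'))^2 \,\big|\, X'\big].$$

From this relation, we therefore infer that

$$\sqrt{\operatorname{Var}(\mathbb{E}(U \mid X))} \le \frac{1}{2\sqrt{2}} \sum_{A \subsetneq [n]} \kappa_{n,A} \sqrt{\sum_{i=1}^{n} \mathbb{E}\big(\tilde{\Delta}_i U_A\big)^2}.$$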

Now recall that Ua = Ta or Ua = T^i, i-e. Ua = where either g is the 

identity or g{-) = \ ■ |. Expanding the square yields 

n 2 ^ 

J2^[a,Ua) =E E E|A,(A,/(A)5(A,/(A^)))||A,(Afc/(A)5(Afc/(A^)))|. (5.7) 

i=l i=l j,k^A 
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Now fix 1 ^ ^ n, write X® = Si{X) and observe that for j ^ A, 

A,iA,f{X)giAjfiX^))) = AiiA,f{X))g{Ajf{X^)) + A,/(X®)A,(5(A,/(X^))). (5.8) 

We note immediately that, in the case i = j, using I Ajg'(f/(X))| ^ |Aj(y(A))| and Aj(Ai(y(A))) = 
Aj(y(A)) for any random variable V{X), the right-hand side of (I5.8p is bounded by the simpler 
expression 


1 


\AJ{X)AifiX^)\ + \Aif{X^)Aif{X^)\ ^ ^ [AJiXf + Aif{X^f + AJ{X^f + A,/(A^) 


(5.9) 


Now let us examine each summand appearing in (15.7h separately. \i i ^ A and i = j = k, using 
(j5.9p . the summand is smaller than 


-E 


AifiXf + A,f{xy + Aif{X^f + Aif{X^f < AEAif{X)\ 


rA\2 


In the case where i,j, k are pairwise distinct, introduce the vector X by 

i Xi =Xi 
{ Xi = X[ if / / i, 

and, for x E and some mapping '0 on E*®, define, for 1 ^ ^ n, 

Anp{x) = ljj{x) - Ipixi, . . . , Xl-l,Xl,Xl+l, Xn). 
Then, the corresponding summands are bounded by 

4 sup E\A,{AjfiY))Ajf{Y')AiiAkfiZ))Akf{Z')\. 

{Y,Y',Z,Z') 


Using X = X' and the fact that if T is a recombination, switching the roles of Xi and X[ in Y 
still yields a recombination of {A, A', A}, the previous expression is bounded by 

= 4 sup B\Ai{Ajf{Y))Ajfir)Ai{Akf{Z))Akf{Z')\ 

{Y,Y',Z,Z') 

^ 4 sup El{A^^.^(y)^o}(|A,-/(y)| + |A,/(W)|)|A,-/(y')|x 

(Y,Y',Z,Z') 

X 1{a,,/(z)^o}(|Aa:/(A)| + |A,/(A)|)|Afc/(Z')l 

^ i6i?;(/), 

where we have used Cauchy-Schwarz inequality. The case i ^ j = k is treated with the same 
vector A and operators A;. Using similar computations and Cauchy-Schwarz inequality, we have 
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the upper bound 


4 sup BAi{Ajf{Y))Ajf{Y')Ai{A,f{Z))Ajf{Z’) 

(Y,Y',Z,Z') 

^ 4 sup [BAiiA,fiY)fAjfiY'f] 

(Y,Y') 

= 4 sup [BAjiAJ{Y)fAjf{rf] 

{Y,Y') 

^4 sup El{A^^.;(y)^0}(|A./(y)| + |A,/(y^)|)2A,/(y')' 

{Y,Y') 

^16 sup El{A,.^(y)^0}Ai/(y)2A,/(y')2 
{Y,Y',Z) 

^ IQBnif), 

where the suprema run over recombinations $Y, Y', Z, Z'$ of $\{X, X', \tilde X\}$. Finally, if $i = j \ne k$, the
corresponding summands on the right-hand side of (5.7) are bounded by

4 sup E|A,/(y)2Ai(Afc/(y'))Afc/(y)| 

{Y,Y',Z) 

^4 sup Ei|A^,^(y,)^0}(|Afc/(y)| + |Afc/(y*)|)A,/(y)2|Afc/(y)| 

(Y,Y',Z) 

^ 8Bn{f). 


This yields 

n 2 

E (^AiUA^ ^ 16n ^ [l{j=fc=i}EAi/(A)^ -|- (l{fc^j=i} -|- 
i=l j,kfA 

^ 16n (l|i^^|Ai/(A)" + 2(n - \A\)BM) + (n -\A\fB'M)) , 
and using the inequality $\sqrt{x+y} \le \sqrt{x} + \sqrt{y}$ ($x, y \ge 0$) we deduce that


11' ^ \ / 

^b(^A,Ua) ^ VEAi/(A)4 + + ^/Wjr){n - 

i=l 



Finally, 

VVar(E(17|A)) ^ 

yM I v'EAi/(A)4 E + '/XU) E + AXU) E ''’Un - |A|) 

y AC[n]:l^A 

and the result follows by evaluating the three sums over A C [n] in the last expression. 


□ 
6 Applications

6.1 Set approximation with random tessellations 

Let $K$ be a compact subset of $\mathbb{R}^d$ with positive volume, and let $X = (X_i)$ be a locally finite
collection of points. Assume the only information available about $K$ is given by the values of the
indicator function $\mathbf{1}_{\{X_i \in K\}}$. Then, the Voronoi reconstruction, or Voronoi approximation, of
$K$ based on $X$ is defined as

\[ K^X = \{ y \in \mathbb{R}^d : \text{the closest point from } y \text{ in } X \text{ lies in } K \}. \]

This section is devoted to the study of the error committed when one approximates the volume of
$K \subset [0,1]^d$ with that of $K^X$, when $X$ is a random input consisting of $n$ i.i.d. points in $[0,1]^d$.
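As an illustration of the definition of $K^X$, the following Python sketch estimates $\mathrm{Vol}(K^X)$ by Monte Carlo for a uniform binomial input in the unit square. The set $K$ (a disc), the sample sizes and all function names are assumptions made for this example only; the brute-force nearest-point search stands in for an actual Voronoi tessellation.

```python
import math
import random


def in_disc(p, centre=(0.5, 0.5), radius=0.3):
    """Indicator of a hypothetical set K: a disc inside the unit square."""
    return (p[0] - centre[0]) ** 2 + (p[1] - centre[1]) ** 2 <= radius ** 2


def nearest_point(y, points):
    """Point of `points` closest to y, i.e. the nucleus of the Voronoi cell containing y."""
    return min(points, key=lambda x: (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)


def voronoi_volume(points, indicator, n_test, rng):
    """Monte Carlo estimate of Vol(K^X): fraction of uniform test points y
    whose closest point in X lies in K."""
    hits = 0
    for _ in range(n_test):
        y = (rng.random(), rng.random())
        if indicator(nearest_point(y, points)):
            hits += 1
    return hits / n_test


rng = random.Random(2024)
X = [(rng.random(), rng.random()) for _ in range(200)]   # binomial input, n = 200
estimate = voronoi_volume(X, in_disc, n_test=2000, rng=rng)
true_volume = math.pi * 0.3 ** 2
print(f"Vol(K^X) ~ {estimate:.3f}, Vol(K) = {true_volume:.3f}")
```

Only the values of the indicator of $K$ at the points of $X$ are used by `voronoi_volume`, which matches the information assumption of the text.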

The underlying structure in this approximation scheme is the Voronoi tessellation based on $X$.
For $x \in [0,1]^d$, denote by $V(x;X)$ the Voronoi cell with nucleus $x$ among $X$, i.e. the convex set
formed by points $y \in [0,1]^d$ such that $\|y - x\| \le \|y - x'\|$ for any point $x' \in (X,x)$, where in all
this section $(X, x) := X \cup \{x\}$, and we extend the set notation $\in$ to ordered collections of points
in an obvious way. The volume approximation described above is denoted

\[ \varphi(X) = \mathrm{Vol}(K^X) = \sum_i \mathbf{1}_{\{X_i \in K\}} \mathrm{Vol}(V(X_i; X)). \]


Along the same lines, one can also approximate the perimeter of $K$ via the relation $\varphi_{Per}(X) =
\mathrm{Vol}(K^X \Delta K)$ where $\Delta$ denotes the symmetric difference of sets.
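In dimension 1 both quantities can be computed exactly, since Voronoi cells are intervals bounded by midpoints of consecutive points. The sketch below does this for an interval $K = [a, b] \subset [0,1]$; the choice of $K$, the sample points and the function names are hypothetical and serve only to make the two formulas concrete.

```python
def voronoi_cells_1d(points):
    """Voronoi cells of distinct points of [0, 1], as a list of (point, left, right)."""
    pts = sorted(points)
    cells = []
    for i, p in enumerate(pts):
        left = 0.0 if i == 0 else (pts[i - 1] + p) / 2
        right = 1.0 if i == len(pts) - 1 else (p + pts[i + 1]) / 2
        cells.append((p, left, right))
    return cells


def overlap(left, right, a, b):
    """Length of the intersection of [left, right] and [a, b]."""
    return max(0.0, min(right, b) - max(left, a))


def volume_and_symmetric_difference(points, a, b):
    """Return (Vol(K^X), Vol(K^X symmetric-difference K)) for K = [a, b]."""
    phi = 0.0
    sym_diff = 0.0
    for p, left, right in voronoi_cells_1d(points):
        inside = overlap(left, right, a, b)
        if a <= p <= b:
            phi += right - left                  # the whole cell belongs to K^X
            sym_diff += (right - left) - inside  # part of the cell outside K
        else:
            sym_diff += inside                   # part of K missed by K^X
    return phi, sym_diff


phi, phi_per = volume_and_symmetric_difference([0.1, 0.3, 0.5, 0.9], 0.25, 0.75)
print(phi, phi_per)
```

Here $K^X = [0.2, 0.7]$, so the volume is recovered exactly while the symmetric difference consists of the two small intervals $[0.2, 0.25]$ and $[0.7, 0.75]$.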

This set approximation can serve in image reconstruction and estimation: it was first
introduced by Einmahl and Khmaladze [8] as a discriminating statistic in the two-sample problem.
These authors proved a strong law of large numbers in dimension 1. Heveling and Reitzner [12]
proved that if $K$ is convex and compact and $X = X'$ is a homogeneous Poisson process with intensity
n, E(/?(A') = Vol(A), and Var((^(A')) ^ where c is an explicit constant and S{K) 

is the surface area of K. They also established that E(^Per(A') = dn~^/^S{K)(\ + 0{n~^/^)) and 
Var((^Per(A')) ^ Reitzner, Spodarev and Zaporozhets [23] extended these results 

to sets with finite variational perimeter, and also gave upper bounds for E|y?(A')'^ — Vol(A)^| for 
q > 1. Schulte m proved a similar lower bound for the variance, i.e. CS{K)n ^ ^ Var((^(A')) 

with K a convex body and C a universal constant, and the corresponding CLT 


U{X')-^y,{X') 
\ VVar(:^(A')) 



^ 0 . 


Yukich m then gave an upper bound on the speed of convergence in Kolmogorov distance. 

For Binomial input, Penrose proved that for measurable $K$ and $X$ consisting of $n$ iid variables
with density $\kappa(x) > 0$ on $[0,1]^d$,

\[ E\varphi(X) \to \mathrm{Vol}(K), \qquad (6.1) \]

without assumption on $K$, not even the negligibility of its boundary. Yukich [31] managed to extend
to a non-Poissonized setting the estimates on the variance magnitude as well as the central limit
theorem for the volume approximation. See also [3] for a result involving the Hausdorff distance.
In this section, we consider a binomial input $X = (X_1, \dots, X_n)$, where the $X_i$ are $n$ iid variables
uniformly distributed on $[0,1]^d$. We give asymptotic upper bounds for the moments of $\varphi(X) -
E\varphi(X)$, as well as a central limit theorem with rates of convergence in the Kolmogorov distance, which
is new in the literature. Note that, in the words of Heveling and Reitzner [12], “the general problem
whether $K^X$ approximates $K$ for complicated sets seems to be difficult”, and many applications
of set approximation are concerned with the detection or approximation of sets with an irregular
boundary, see for instance [6] or the survey [16, Chap. 11]. Our results also hold for large classes
of irregular sets, with a possibly fractal boundary. The regularity of the boundary of $K$ will be
assessed in terms of the following quantities. Call below Lebesgue-boundary of $K$, written $\partial K$, the
class of points $x$ such that for all $\varepsilon > 0$, $\mathrm{Vol}(B(x, \varepsilon) \cap K) > 0$ and $\mathrm{Vol}(B(x, \varepsilon) \cap K^c) > 0$. Let $\beta > 0$.
Denote by $d(x, A)$ the Euclidean distance from a point $x \in \mathbb{R}^d$ to a subset $A \subset \mathbb{R}^d$. Define

\[ \partial K^r = \{x : d(x, \partial K) \le r\}, \]
\[ \partial K^r_+ = K^c \cap \partial K^r, \]
\[ \gamma(K, r) = \int_{\partial K^r_+} \left( \frac{\mathrm{Vol}(B(x, \beta r) \cap K)}{r^d} \right)^2 dx. \]

$K$ is said to satisfy the weak rolling ball condition if

\[ \gamma(K) := \liminf_{r \to 0} \mathrm{Vol}(\partial K^r)^{-1} (\gamma(K, r) + \gamma(K^c, r)) > 0. \qquad (6.2) \]

This assumption somehow implies that either $K$ or $K^c$ occupies a constant positive proportion
of space as one zooms in on a typical point close to $\partial K$, at least in a non-negligible region of $[0,1]^d$.
It is related to a weak form of the rolling ball condition used in set estimation (see for instance
condition (a) of Theorem 1 in [6], the definition of standard sets in [25], Remark 4 in [27], or the
survey [16, Chap. 11] and references therein), where for each $x \in \partial K$ a ball of radius $\beta r$ touching
$x$ should lie in $\partial (K^c)^r_+$ or $\partial K^r_+$. In our weaker form of the condition, the ball is somehow allowed
to be deformed to fit in the parallel body. It certainly allows sets whose boundary is smooth in a
certain sense, and does not discard a priori fractal sets. It is proved in [18] that a class of fractal
sets including for instance the 2-dimensional Von Koch flake and antiflake satisfy the condition, as
well as the hypotheses of the following theorem with $\alpha = 2 - s$, $s = \log(4)/\log(3)$ being the fractal
dimension of the boundary.
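For the Von Koch example the exponents can be made numerically explicit. The short computation below assumes $d = 2$ and reads the rates of this section as $n^{-1-\alpha/d}$ for the variance and $n^{-1/2+\alpha/(2d)}$ (up to logarithmic factors) for the Kolmogorov distance; the variable names are illustrative.

```python
import math

d = 2
s = math.log(4) / math.log(3)              # fractal dimension of the Von Koch boundary
alpha = 2 - s                              # exponent alpha = 2 - s of the parallel volume
variance_exponent = -1 - alpha / d         # power of n in the variance magnitude
rate_exponent = -0.5 + alpha / (2 * d)     # power of n in the Berry-Esseen rate

print(f"s = {s:.4f}, alpha = {alpha:.4f}")
print(f"variance ~ n^({variance_exponent:.4f}), rate ~ n^({rate_exponent:.4f})")
```

For comparison, a smooth boundary in the plane corresponds to $\alpha = 1$, which under the same reading gives the rate exponent $-1/4$; the rougher the boundary (smaller $\alpha$), the faster the rate.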

Theorem 6.1. Let $K \subset [0,1]^d$ such that

\[ \mathrm{Vol}(\partial K^r) \le S_+(K) r^\alpha \qquad (6.3) \]

for some $S_+(K), \alpha > 0$. Then for $n, q \ge 1$,

\[ E|\varphi(X) - E\varphi(X)|^q \le S_+(K) C_{d,q,\alpha} n^{-q/2-\alpha/d} \qquad (6.4) \]

for some $C_{d,q,\alpha} > 0$ explicit in the proof. If furthermore $K$ satisfies the weak rolling ball condition
(6.2) and

\[ \mathrm{Vol}(\partial K^r) \ge S_-(K) r^\alpha \qquad (6.5) \]

for some $S_-(K) > 0$, then for $n$ sufficiently large


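\[ C_d^- S_-(K) \gamma(K) \le \frac{\mathrm{Var}(\varphi(X))}{n^{-1-\alpha/d}} \le C_d^+ S_+(K) C_{d,2,\alpha}, \]

for some $C_d^-, C_d^+ > 0$, and for every $\varepsilon > 0$, there is $C_\varepsilon > 0$ not depending on $n$ such that

\[ d_K\left( \frac{\varphi(X) - E\varphi(X)}{\sqrt{\mathrm{Var}(\varphi(X))}}, N \right) \le C_\varepsilon n^{-1/2+\alpha/2d} \log(n)^{3+\alpha/d+\varepsilon} \]

for $n \ge 1$.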

Remarks 1. 1. The previous theorem also applies to smooth sets. Blaschke’s theorem (see for
instance m Theorem 1]) yields that any manifold $K$ with Lipschitz normal admits
inside and outside rolling balls in the traditional sense, and satisfies in particular our weak
rolling ball condition. Furthermore, such a set and its complement have positive reach, which
proves by the Steiner formula that the upper and lower bounds (6.3), (6.5) are satisfied, see the
pioneering work of Federer [U]. The result might still hold if the boundary is only piecewise
regular, see for instance Remark 4 in [27].

2. If (6.2) is not satisfied, we can still get a lower bound on the variance (and therefore a rate
of convergence), but its magnitude will not match that of the upper bound, see Lemma 6.8.
It might be difficult for such a set to get a clear estimate of the variance. See also the
counterexample in [18].

3. The constant $\beta$ in the rolling ball condition is left at our choice. The larger $\beta$, the easier it
is for $K$ to verify the condition.

4. Conditions (6.3) and (6.5) imply that $K$ has Minkowski dimension equal to $d - \alpha$, and
furthermore that $K$ has lower and upper Minkowski content (see for instance [E]). Self-similar
sets satisfy these hypotheses, and are treated in [18], as well as some examples, such
as the Von Koch flake, that also satisfy the weak rolling ball condition. We provide as well
the example of a set $K$ with lower and upper Minkowski content for $\alpha = 1/2$ that does not
satisfy the rolling ball condition. Simulations indicate that for this example the variance is
indeed negligible with respect to $n^{-1-\alpha/d}$, but it is still possible to get a rate of convergence
for Kolmogorov distance to the normal law.

5. The uniformity of the distribution of the $X_i$’s does not have a crucial importance, apart
from easing certain geometric estimates. The results should hold, up to constants, if the
common distribution of the $X_i$’s is only assumed to have a density bounded from below by
some constant $\kappa > 0$ on the domain $\partial K^r$, for some $r > 0$.

6. The Berry-Esseen bound is derived from (5.1). It turns out that each of the terms on the
right-hand side of (5.1) contributes with the same power of $n$, heuristically indicating that
this power is likely to be optimal.

The proof of the theorem is decomposed into several independent results. The variance lower
bound is established in the specific framework of Voronoi volume approximation. The Kolmogorov
distance and moments upper bounds are potentially valid in a more general framework.
Theorem 6.2. Define $\sigma^2 = \mathrm{Var}(\varphi(X))$. Assume that $\mathrm{Vol}(\partial K^r) \le S_+(K) r^\alpha$ for some $S_+(K), \alpha > 0$.
Then (6.4) holds, and for every $\varepsilon > 0$ there is a constant $C_\varepsilon$ not depending on $n$ such that for
$n \ge 1$,

dx - E(^(X)),iV) ^ Ce log(n)3+“/2rf+£ 

( 6 . 6 ) 


where N is a standard Gaussian variable. 

Say that two points $x, y \in [0,1]^d$ are Voronoi neighbours among a point set $X$ if
$V(x;X) \cap V(y;X) \ne \emptyset$. More generally, denote $d_V(x,y;X)$ the Voronoi distance between $x$ and $y$, i.e. the
minimal integer $k \ge 1$ such that we can form a path $x_0 = x, x_1 \in X, \dots, x_{k-1} \in X, x_k = y$ where
$x_i$ and $x_{i+1}$ are Voronoi neighbours. Denote $v(x,y;X) = \mathrm{Vol}(V(x;(X,y)) \cap V(y;X))$ the volume
that the cell $V(y;X)$ loses when $x$ is added to $X$. We have the explicit expression, for $x \notin X$,

\[ \varphi(X,x) - \varphi(X) = \mathbf{1}_{\{x \in K\}} \sum_{y \in X \cap K^c} v(x,y;X) - \mathbf{1}_{\{x \in K^c\}} \sum_{y \in X \cap K} v(x,y;X). \qquad (6.7) \]

Since $v(x,y;X) = 0$ if $x$ and $y$ are not Voronoi neighbours in $(X,x,y)$, the concatenation of $X$ with
$x$ and $y$, the following properties hold.

Proposition 6.3. Let $X = (X_i)_{1 \le i \le n}$ be a finite collection of points.

(i) For $1 \le i \le n$ such that $X_i \in K$ (resp. $K^c$), if every Voronoi neighbour of $X_i$ among $X$ is
also in $K$ (resp. $K^c$), then $D_i\varphi(X) = 0$.

(ii) For every point $X_j$ at Voronoi distance $> 2$ from some $X_i \in X$, $D_{ij}\varphi(X) = 0$.
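The add-one cost behind these properties can be checked by hand in dimension 1. The sketch below assumes the following reading of (6.7): adding a point $x \in K$ increases the reconstructed volume by what the new cell of $x$ takes from cells of points outside $K$, and adding $x \in K^c$ decreases it by what is taken from cells of points in $K$. The interval $K = [a,b]$, the configuration and the function names are hypothetical.

```python
def cells(points):
    """Voronoi cells of distinct points of [0, 1], as {point: (left, right)}."""
    pts = sorted(points)
    out = {}
    for i, p in enumerate(pts):
        left = 0.0 if i == 0 else (pts[i - 1] + p) / 2
        right = 1.0 if i == len(pts) - 1 else (p + pts[i + 1]) / 2
        out[p] = (left, right)
    return out


def phi(points, a, b):
    """Volume approximation of K = [a, b]: total length of cells whose nucleus lies in K."""
    return sum(r - l for p, (l, r) in cells(points).items() if a <= p <= b)


def add_one_cost(points, x, a, b):
    """Signed sum of the volumes v(x, y; X) taken by the new cell of x from
    cells of points y on the other side of the boundary of K."""
    x_left, x_right = cells(points + [x])[x]
    x_in_k = a <= x <= b
    total = 0.0
    for y, (l, r) in cells(points).items():
        v = max(0.0, min(r, x_right) - max(l, x_left))   # v(x, y; X)
        y_in_k = a <= y <= b
        if x_in_k and not y_in_k:
            total += v      # volume gained by the reconstruction
        elif not x_in_k and y_in_k:
            total -= v      # volume lost by the reconstruction
    return total


X = [0.1, 0.3, 0.5, 0.9]
a, b = 0.25, 0.75
for x in (0.7, 0.2, 0.4):
    lhs = phi(X + [x], a, b) - phi(X, a, b)
    rhs = add_one_cost(X, x, a, b)
    print(x, round(lhs, 12), round(rhs, 12))
```

The point $x = 0.4$ has both of its neighbours in $K$, so its cost vanishes, in line with property (i); the points $0.7$ and $0.2$ have a neighbour across the boundary and change the volume by $\pm 0.1$ and $\pm 0.05$ respectively.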

Remark 6.4. These properties mean somehow that $\varphi$ is of range 2 with respect to the Voronoi
tessellation. An analogue of Theorem 6.2 should hold for any functional with finite range, such
as the perimeter approximation induced by $\varphi_{Per}$. On the other hand, the variance lower bound
derived in this section is specific to the volume approximation.

We define for $x \in X = (X_i)$ a finite collection of points, $k \ge 1$,

\[ R_k(x;X) = \sup\{\|y - x\| : y \in V(X_i;X),\ d_V(x,X_i;X) \le k\} \]

the distance to the furthest point in the cell of a $k$-th order Voronoi neighbour, with $R(x;X) :=
R_0(x;X)$. If $x$ does not have $k$-th order neighbours, we put the convention $R_k(x;X) = \mathrm{diam}([0,1]^d) =
\sqrt{d}$. We have obviously

\[ \mathrm{Vol}(V(x;X)) \le \kappa_d R(x;X)^d, \quad x \in X, \qquad (6.8) \]

where $\kappa_d$ is the volume of the unit ball in $\mathbb{R}^d$.
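The constant $\kappa_d$ has the classical closed form $\kappa_d = \pi^{d/2}/\Gamma(d/2+1)$. The sketch below evaluates it and the right-hand side of (6.8); the function names are illustrative and not taken from the paper.

```python
import math


def unit_ball_volume(d):
    """Volume kappa_d of the Euclidean unit ball of R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def cell_volume_bound(radius, d):
    """Right-hand side of (6.8): kappa_d * R^d, an upper bound for the volume
    of a cell contained in a ball of the given radius around its nucleus."""
    return unit_ball_volume(d) * radius ** d


for d in (1, 2, 3):
    print(d, unit_ball_volume(d), cell_volume_bound(0.1, d))
```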

Proof of Theorem 6.2. We will use Theorem 5.1 with the functional $f(X) = \varphi(X) - E\varphi(X)$. Let
us start with a crucial bound.

Lemma 6.5. Assume that (6.3) holds. Define for some $k \ge 0$, the random variable

\[ U_k = \mathbf{1}_{\{d(X_1, \partial K) \le R_k(X_1; X)\}} R_k(X_1; X)^d. \]

Then for some $c_{d,qd+\alpha,k} > 0$,

\[ E U_k^q \le S_+(K) c_{d,qd+\alpha,k} n^{-q-\alpha/d}, \quad n \ge 1,\ q \ge 1. \]

Proof. Under this form, it is problematic to give a sharp upper bound because the law of Ri^{Xi;X) 
depends on the position of Xi within [0,1]'^. To inject some stationarity in the problem, we 
will bound Rj^{Xi-,X) = R^^Xi; X^) by introducing a closely related quantity Rk{Xi; X^) whose 
conditional law with respect to X^ is independent of the value of Xi. To this end, introduce the 
process 


X’= [J (X + m), 

rri^lA 


which law is invariant under translations. Remark that given any t E X' has a.s. exactly n 
points in [t, t + l]*^. For x € M'^, call 

Cx = {[x-t,x-t + lf;t E [0,1]'^} = {[y,y + if : y G R‘^,x G[y,y + 1]'^}, 
the family of translates of [0,1]“^ that contain x. Then by stationarity of X' ^ the law yk,n of 

Rk{x,X) := sup Rk{x,X' Cl C) 

CSC, 


does not depend on x (and it is indeed only a function of x and X). Also, for x 

E [0,1]^ [0,1]'^eC,, 

whence Rk{x,X) ^ Rk{x, X). This yields 



(6.9) 

^ / l{d{x,dK)iir}'^‘^ hk,n—l{dr')dx 

(6.10) 

iR+x[0,l]‘* 


^ S+{K)BRki0]Xy^+'^ using (IQD. 

(6.11) 


Let us now bound the probability of the event $R_k(0, X) \ge r$, for some $r \ge 0$. If this event is realised,
there is a $k$-th order Voronoi neighbour $z \in X'$ of $0$ and a point $y$ in the Voronoi cell of $z$ such that
$\|y\| \ge r$. There is therefore a sequence of points $x_1 = 0, x_2 \in X', \dots, x_k = z, x_{k+1} = y$ such that
for $i < k$, $x_i$ and $x_{i+1}$ are Voronoi neighbours. Since the midpoint $z_i$ of $x_i$ and $x_{i+1}$ has $x_i$ and $x_{i+1}$
as closest neighbours in $(X', 0)$, the open ball $B^\circ(z_i, \|x_i - x_{i+1}\|/2)$ has an empty intersection with
$X'$. Since $z$ is the point of $X'$ closest to $y$, $B^\circ((z + y)/2, \|z - y\|/2) \cap X' = \emptyset$ also. We therefore have
$k$ (possibly empty) open balls $B_1, \dots, B_k$, with respective radii $r_i$, $i = 1, \dots, k$, such that $[x_i, x_{i+1}]$
is a diameter of $B_i$, and such that $X'$ has a point in none of them. Since $\|y\| \ge r$, the radius of at
least one of these balls is larger than $r/2k$. Define

\[ i_0 := \min\{1 \le i \le k : r_i \ge r/2k\}. \]
We have by the triangular inequality $\|x_{i_0}\| \le i_0 r/2k \le r/2$, and the ball $B(x_{i_0}, r/2k)$ is empty of
points of $X'$ and is contained in $[-r, r]^d$. It is easy to find $\gamma_d > 0$ such that at least one of the cubes
$[g, g + \gamma_d r]^d$, $g \in \gamma_d r \mathbb{Z}^d \cap [-r, r]^d$, is contained in every ball with radius $r/2$ contained in $[-r, r]^d$.

This yields 

P(/2fc(0, X)^r) ^ P(35 G y^rZ"' n [-r, ■. X' <r^[g,g + y^r]'^ = 0) 

^ #(7X n [-1,1]‘')P([0,0 + 7dr]^ n X' = 0). 

Since #[0,0 + ydr]'^ n X' ^ n for r ^ 7 ^^ and X' n [0,0 + y^r] = X n [0, 0 + y^r] for r ^ y^^, we 
finally have 

P(74(0,X) ^ r) ^ 2‘^y-'^(l - y^r-^)” ^ 2^^exp(-ny^r'^). 

It then follows that for u > 0, 

POO POO 

E74(0, = / P(74(0, X^) > ri/“)dr ^ 2^-'" / exp(-(n - l)yjr''/“)dr 

Jo Jo 

POO 

^ 2'^y-'^(n - 1)-“/'^ / exp(-y^r'^/“)dr. 

Jo 

The conclusion follows by reporting this in (6.9). □

Proposition 6.3 and (6.8) yield for $q \ge 1$

|EL»i/(X)'?| ^ K^EUf. 


Lemma 6.5 implies, for $q \ge 1$,

E|Di/(X)|'? ^ Cd,gd+a4S+iK)n-^-‘^/'^, (6.12) 

therefore the second term of the right-hand side of (6.6) follows immediately from the last estimate
in (5.1). We now state Rhee-Talagrand’s inequality [24], which then immediately yields (6.4).

Lemma 6.6 (Rhee-Talagrand’s inequality). Let $\psi(X)$ be a symmetric measurable functional with
finite $q$-th moment. Then for $q > 1$

\[ E|\psi(X) - E\psi(X)|^q \le n^{q/2} c_q E|D_1\psi(X)|^q \]

with Cq = 2'^(18Y^g')'^ , where 1/q + 1/q' = 1. For q = 2, Stein-Efron’s inequality yields the better 
constant C 2 = 1/2. 
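The case $q = 2$ can be verified exactly on a toy example. The sketch below uses the resampling form of the Efron-Stein inequality, $\mathrm{Var} f(X) \le \frac{1}{2} \sum_i E (f(X) - f(X^{(i)}))^2$, where $X^{(i)}$ has its $i$-th coordinate replaced by an independent copy; this choice of difference operator is an assumption made for the illustration and may differ from the operator $D_1$ of the lemma. The distribution (uniform on $\{0,1\}^n$) and the test functionals are hypothetical.

```python
from itertools import product


def efron_stein_check(f, n):
    """Exact variance and Efron-Stein bound for f(X), X uniform on {0, 1}^n."""
    states = list(product((0, 1), repeat=n))
    mean = sum(f(x) for x in states) / len(states)
    variance = sum((f(x) - mean) ** 2 for x in states) / len(states)
    bound = 0.0
    for i in range(n):
        total = 0.0
        for x in states:
            for new in (0, 1):                     # independent copy of coordinate i
                y = x[:i] + (new,) + x[i + 1:]
                total += (f(x) - f(y)) ** 2
        # average over the 2 * len(states) equally likely pairs, then factor 1/2
        bound += 0.5 * total / (2 * len(states))
    return variance, bound


var_max, bound_max = efron_stein_check(max, 4)
var_sum, bound_sum = efron_stein_check(sum, 4)
print(var_max, bound_max)
print(var_sum, bound_sum)
```

For the sum the two sides coincide, which reflects the fact that the inequality is an equality for linear functionals; for the maximum the bound is strict.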

Let us bound the first two terms of (5.1). We need for that to control the maximum radius
of Voronoi cells over $X$. We first introduce the event on the circumscribed radii of the Voronoi
spheres,

\[ \Omega_n(X) = \left( \max_{1 \le i \le n} R(X_i; X) \le n^{-1/d} \rho_n \right), \]

where $\rho_n = \log(n)^{1/d+\varepsilon'}$ for $\varepsilon'$ sufficiently small. We have the following lemma, proved later for the
sake of readability.

Lemma 6.7. For all $\eta > 0$, $n^\eta P(\Omega_n(X)^c) \to 0$ as $n \to \infty$.

To bound the first term of (5.1), let $Y, Y', Z$ be recombinations of $\{X, X', \tilde X\}$. Introduce the
event $\Omega := \Omega_n(Y) \cap \Omega_n(Y') \cap \Omega_n(Z')$ which satisfies $P(\Omega^c) \le 4 P(\Omega_n(X)^c)$. Recall the fact
that $D_{ij} f(X)$ can only be non-zero if $X_j$ is at Voronoi distance $\le 2$ from $X_i$, and that $D_j f(X)$ can
only be non-zero if $X_j$ has a Voronoi neighbour whose cell touches $\partial K$. In the notation of (5.1),
we have


^ 4n-^ptf2^n-^piP{d{Yi,dK) ^ 2n-^/V) + 

^ Ci,2n-5-“/'^p“+a 

for some Cip > 0, whence Proposition 15.,11 and (15.,11) yield nBn{f) ^ C'n~^~°^^'^p^^°‘ for some 
C > 0. With a similar computation, 

El{O}l{Di,2</j(V)^0,Di.395(y')7^0}-C'2V^(-^)^ 

^ 4n-4^4P{\\Yi - y2|| ^ 2n-^/V, ll^i' - yg'H ^ 2n-^/4n,d{Yi,dK) ^ 2n-i/V) + 

^ C2,3n-^-^/4n^", 

from where n^B'^f) ^ for some C" > 0. Therefore the first term of (15.ip is 

bounded by 


up to a constant, which yields the first term of (16.6p . It remains to bound the term 

E|/(V)||D,/(X^)|3 

from (15.ip . Recall that under all Voronoi cells volumes, and therefore all |i4j/(V"^)|,l ^ 

j ^ n, are bounded by Kdn'~^Pnj and also, Djf{X4 = 0 if Xj and V' are at distance more than 
2n~^^'^Pn from K's boundary. We have 


nf{X)D,f{x4f ^ E {\f[X)\\D,f[x4\Hn4x4)+n^n{X) 


^ cn-V^"E 


^ cn E 


\fiX)\l 


{Xj or X'.&dK^^ 


+ P(I2„(Vr 


+ IA 


or j J + )• 


We have 


E\Djf{X)\ ^ 


by (I6.12p . while the other term is bounded by independence by 

^ 2E|/(W)|P (Xj E 
Finally, for some $C > 0$,

E\f{X)Djf{X^)f ^ Cn-3-"/'^log(n)3+^/2(^lQg(„)a/rf+£/2 
which gives the desired bound. 

□ 

Proof of Lemma 6.7. We can find a constant $\gamma_d > 0$ such that the intersection with $[0,1]^d$ of
every ball centred in [0,1]*^ of radius r ^ 1 contains a cube g + [0,'yd'i"]'^ for some g G If 

maxi^j^„X) > then two Voronoi neighbours Xi,Xj are at distance more than 

n-i/dp^ from one another, and the open ball with diameter [Xi,Xj] does not contain points of X, 
by the construction of the Voronoi tessellation. It follows that a cube g + [0, C [0, l]'^ 

is empty of points of X, for some g G and this event happens with a probability 

bounded by 

(7dn-'/V)-"P([0,7<in-'/V]" n V = 0) ^ 7,-Vn"(l - Vn)" 

^ 7 d‘^np-^exp{nlog{l - 7d?^"Vn)) 

^ Vn''exp(-7ilog(n)i+*'), 

which proves the result. □ 

Proof of Theorem \6.1l It only remains to prove the lower bound on the variance in (j6.5l) . Lemma 
El states that the variance is larger than ^II^II^ 2 qq where 

h{x) =Eip{X^,x)-E^{X), xGiO,!]"'. 


We decompose $h$ as follows:

\[ h(x) = E[\varphi(X^1, x) - \varphi(X^1)] - E[\varphi(X) - \varphi(X^1)], \quad x \in [0,1]^d \]
\[ =: h_1(x) - h_2. \qquad (6.13) \]

Voronoi volume approximation is not homogeneous in the sense that points falling close to $K$’s
boundary have more influence than other points of $X$. The following lemma shows that this
inhomogeneity makes $h_1$ the dominant term in the previous decomposition.


Lemma 6.8. Let $K$ be a measurable subset of $[0,1]^d$, define $h_1$ as in (6.13). Then we have

\[ \int_{[0,1]^d} h_1(x)^2 dx \ge c_d (\gamma(K, n^{-1/d}) + \gamma(K^c, n^{-1/d})) n^{-2} \]

for some $c_d > 0$.

Let us first conclude the proof of Theorem 6.1. If the weak rolling ball condition is satisfied
along with (6.5), it yields

\[ \int_{[0,1]^d} h_1(x)^2 dx \ge c_d S_-(K) \gamma(K) (n^{-1/d})^\alpha n^{-2}. \]

According to Lemma 6.5, $h_2 = O(n^{-1-\alpha/d})$ which is indeed negligible with respect to
$\|h_1\|_{L^2} \ge C_{d,K} n^{-1-\alpha/2d}$. □
Proof of Lemma 6.8. It follows from (6.7) that for $x \in K^c$


\ipix,X^) - ipiX^)\ ='£hx,^K}vix,Xf,X^), 

1=2 

where we notice that the summand distribution does not depend on j. Then 

\hi{x)\ > 1 (x E (n - 1)B1{X2&k}v{x,X2;X^) 

> 1 ("x E dK'f (n — 1)E f v{x,y; X^’‘^)dy 

> 1 fx E 977? (n — l)Vol(S(x,n K) inf Eu(x, y; 

If for some y E [0,l]‘^,e > 0, no point of X^’^ := (Xi)j^i 2 falls in B{y,Qe), then B{y,3e) C 
If furthermore x E [0, 1]*^ lies at distance less than e from y, then with z = x + e||x — 
y\\~^{x - y), 

B{z, e) C E(x, (X^’^y)) C B{y, 3e) C E(y; 
and therefore x(x,?/;X^’^) > We finally have 

inf Eu(x, y; X^’"^) > Kd^'^n"^P(X^’2 ^ = 0 ) > 

for some > 0. With a completely similar result for x E X, we have for some > 0 

Vol(S(x,/3n-i/‘^)nX)2dx+ [ Vol(S(x,/3n-^/'^) nX‘=)2dx 

i/d JdK^ 

Remark 6.9. All three terms of (5.1) give in the case of Theorem 6.1 a bound of order $n^{-1/2+\alpha/2d} \log(n)^q$
for some $q > 0$. In these conditions it seems hard to reach a Berry-Esseen bound with a
better magnitude than $n^{-1/2+\alpha/2d}$; removing the log is an open problem.

□ 


Jw 


hi{x)‘^dx > 


ldK2 



6.2 Covering processes 

Let $(\mathcal{K}, \mathcal{H})$ be the space of compact subsets of $\mathbb{R}^d$ endowed with the hit-and-miss topology and a
Borel probability measure $\nu$. Let $E_n$ be a cube of volume $n$, and $C_1, \dots, C_n$ iid uniform variables
in $E_n$, called the germs. Let $n$ iid compact sets $K_1, \dots, K_n$ be distributed as $\nu$, called the grains,
and define the germ-grain process $X_i = C_i + K_i$, $i = 1, \dots, n$. An important feature of the model
regarding Gaussian approximation is the radius

\[ R_i := \sup\{\|x\| : x \in K_i\}, \quad 1 \le i \le n. \]

We consider the random closed set formed by the union of the grains translated by the germs

\[ F_n = \left( \cup_{k=1}^n X_k \right) \cap E_n. \]
We are interested in the covered volume

\[ f_V(X_1, \dots, X_n) = \mathrm{Vol}(F_n), \]

the number of isolated grains

\[ f_I(X_1, \dots, X_n) = \#\{k : X_k \cap X_j \cap E_n = \emptyset,\ k \ne j\}, \]

and their centred versions with unit variance $\tilde f_V$, $\tilde f_I$. The functional $f_V$ denotes the total volume of
the germ-grain process, and $n^{-1} f_V(X_1, \dots, X_n)$ can serve as an estimator for the volume fraction,
i.e. the portion of the space occupied by the boolean model, and therefore be used in
estimating the parameters of $\nu$ (see m for insights on the boolean model statistics).
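To fix ideas, the two functionals can be simulated in the plane with disc-shaped grains. In the sketch below the grain law (discs with uniform random radius), the Monte Carlo evaluation of the covered area and the function names are assumptions; for simplicity the isolation test ignores the restriction of intersections to $E_n$.

```python
import math
import random


def isolated_grains(centres, radii):
    """Number of discs meeting no other disc (the restriction to E_n is ignored)."""
    count = 0
    for k in range(len(centres)):
        meets_other = any(
            math.dist(centres[k], centres[j]) <= radii[k] + radii[j]
            for j in range(len(centres)) if j != k
        )
        if not meets_other:
            count += 1
    return count


def covered_volume(centres, radii, side, n_test, rng):
    """Monte Carlo estimate of the area of the union of the discs inside [0, side]^2."""
    hits = 0
    for _ in range(n_test):
        y = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        if any(math.dist(y, c) <= r for c, r in zip(centres, radii)):
            hits += 1
    return side ** 2 * hits / n_test


rng = random.Random(7)
n = 50
side = math.sqrt(n)                                   # E_n is a square of area n
germs = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
radii = [rng.uniform(0.2, 0.6) for _ in range(n)]     # hypothetical grain law
f_V = covered_volume(germs, radii, side, n_test=5000, rng=rng)
f_I = isolated_grains(germs, radii)
print(f"covered area ~ {f_V:.2f} out of {n}, isolated grains: {f_I}")
```

Dividing `f_V` by `n` gives the empirical volume fraction mentioned in the text.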

Kolmogorov Berry-Esseen bounds in $n^{-1/2}$ for binomial input for $f_V$ or $f_I$ have only been
obtained very recently in m with balls with deterministic identical radii (with the possibility
to extend the method to a random radius), using size-biased couplings. Chatterjee [1] obtained
similar bounds in Wasserstein distance. We present here the first such bounds in the unbounded
random grain context. Furthermore, the computations are quite straightforward and the method
is generalisable to similar local functionals of the boolean model, such as the perimeter, or other
Minkowski functionals. The use of the bound (5.1) is crucial to have a decay in $n^{-1/2}$ in the
context of random grains. The variance is a straightforward computation of integral geometry: it
is a consequence of for instance m Th. 4.4] that under the conditions of the theorem below, we
have $cn \le \mathrm{Var} f(X_1, \dots, X_n) \le Cn$ for some $c, C > 0$, for $f = f_V$ or $f = f_I$.

Theorem 6.10. Assume that < oo. Let N be a standard Gaussian variable. Then we have 

for some C > 0, 

dK{fv{Xi,...,Xn),N)^Cn-^/\ 

If'ERf'^ < oo, for some C > 0, 

dK{fi{Xy...,Xn),N)^C'n-^/^. 

Proof. Let first $f = f_V$. Given an $n$-tuple $x = (x_1, \dots, x_n) \in \mathcal{K}^n$, we have $D_{ij} f(x) = 0$ as soon as
$\mathrm{Vol}(x_i \cap x_j) = 0$, which gives us a sufficient condition. Let us estimate the right-hand side of (5.1).
Introduce independent copies $X', \tilde X$ of $X$, and for $U$ a random compact set among those families,
denote by $c(U)$, $r(U)$, $K(U)$ its centre, radius, and grain, so that

\[ \{c(X_i), c(X'_i), c(\tilde X_i), K(X_i), K(X'_i), K(\tilde X_i),\ 1 \le i \le n\} \]

is a family of independent variables. Let us write $V_i = \mathrm{Vol}(X_i)$, $V'_i = \mathrm{Vol}(X'_i)$. We have $|D_i f_V(X)| \le
V_i$, and since the volume has a finite moment of order 5,

supE|Zli/(X)p < oo, supE|Zli/(X)|^ < oo. 

n>l n>l 

We also have for A C [n] 

E|/(X)||Z1,/(X^)|3 ^ E|/(X^’)Z1,/(X^)|3 + B\D,f{X)D,fixy\ 

^ E|/(X^)|(K/ + {V'f) + ED,fiX)^ 

^ E|/(X^')|2EK/ + EVj{Xf, 
whence




for some C > 0. 

To estimate $B_n(f)$, we use Proposition 5.3, (5.2), and (5.3). Fix $Y, Y', Z$ recombinations

of $\{X, X', \tilde X\}$; we have


^ El{y2nyi7^0}Vol(Zi)"' 


^ E 


K>(Zirp(c(y2) G i?(c(yi),r(yi)+ r(y2))|li,^i,r(i"2)) 


^n-'4Er(Zi)^'='(r(yi)+r(y2)) 


whence sup„ nB^if) < oo since < oo. 

Then, 

El{^3i,2/(y)^o,Di,3/(y')7^o}^2/(y)^ 

< E [voi(Z2)"i{Di2/(y)^0}P(c(y3') G i?(c(y/),r(y/) + r(y3'))|y2,yi,y2,yi',r(y3'))] 

^ n-V^E [r(Z2)"(r(yi') +r(y3'))‘'P(c(y2) G B(c(yi), r(yi) + r(y2))|Z2, Pi, P/, Ps, KTa)) 

< n-\^Er(y2)^(r(yi') + r{Y^)Y{r{Yi) + r{Y2)f. 


Using the definition of recombinations, the variables Y(,Z 2 ,Y^ are pairwise independent, and the 
expectation above is finite because of ErpPi)^'^ < oo. We indeed have sup„n^i?^(/) < oo, which 
concludes the proof for the Kolmogorov bound on fy- 


Dealing with $f = f_I$ is slightly more complicated. Introduce $d_{ij}(X)$ the distance between $i$
and $j$ in the germ-grain process $X$, defined as the smallest number $q$ such that there is a chain
$i_1 = i, \dots, i_q = j$ such that $X_{i_l} \cap X_{i_{l+1}} \ne \emptyset$. Call $B_i^p(X)$ the set of points at distance $\le p$ from
the point $i$ for the distance $d_{\cdot\cdot}(X)$. For some $1 \le i, j \le n$, the value of the functional

\[ \mathbf{1}_{\{X_j \text{ is isolated}\}} = \mathbf{1}_{\{X_j \cap X_k \cap E_n = \emptyset,\ k \ne j\}} \]

can be affected by the removal of $X_i$ only if $X_i \cap X_j \ne \emptyset$, therefore, for $1 \le i \le n$,

\[ |D_i f_I(X)| \le \#B_i^1(X), \]

whence,

\[ E|D_i f_I(X)|^q \le E \#B_i^1(X)^q, \quad q \ge 1. \qquad (6.14) \]

We will estimate this bound later. With the same notation as for the functional $f_V$, let us now
deal with $B_n(f)$, $B'_n(f)$. Remark that $D_{ij} f_I(X) = 0$ if $d_{ij}(X) > 2$. We have

Bnif) ^ sup E 1 ^ 2 eBj(Y)}#Bl{Z)‘^ 


l{2es?(y)} ^ X] 
To simplify notation, remark that for $Y, Z$ recombinations of $\{X, X', \tilde X\}$, $\#B_i^p(Y) \le \#B_i^p(T)$
where $T$ is the concatenation of $Y$ and $Z$ and is in fact composed of $m$ iid variables distributed as
$X_1$, where $n \le m \le 2n$. We then have


Bnif) 


^ sup 

n^m^2n 


eEi 

fc=l 


{Tinrfc^ 0 ,TfcnT 27 ^ 0 } 


E 


'-{W.nTi/0,i=l,.,4}, 


(6.15) 




and the supremum is reached for m = 2n. We have similarly, with m = 3n, 


l{Tinrfc^0,r2nrfc^0} l{TinT^^,^9,T3nT^^,m ^{Tinn.m- (6.16) 

k=l k'=l \<.= (ki,k2,k^,k4)£[m]'^ 

To estimate (6.14)-(6.16), it is useful to introduce some more notation. Call graph on $[n]$ the finite
data of distinct edges $t = \{\{i_1, j_1\}, \dots, \{i_q, j_q\}\}$. For such a graph, introduce the probability

\[ p(t) = P(T_{i_1} \cap T_{j_1} \ne \emptyset, \dots, T_{i_q} \cap T_{j_q} \ne \emptyset). \]

Say that this graph is a tree when it is connected and has no cycles. Let us prove that for every
tree $t$ with $q$ distinct vertices,

p{t) < (dKdn“^)‘'“^Er(ri)(^“^)'^. (6.17) 


Let t be such a tree, and let an arbitrary vertex zq of t, designated to be the root of t. Call 
Gk{t),k > 1, the members of the fc-th generation, noticing that there can not be more than q 
generations, i.e. Qkit) = 0 for A: > g. Call Q^it) = '^j<kGjit),Qk{t) = Qkit) \^^(t), and call 
Q^'^^it) the collection of all pairs (z, j) such that i E Qkik),j E Qk+iit)., {i,j} E t. We have 


pit) ^ E 


<E 


{TinTjj^(/i-,{i,j}et-,i,jegq (z)} 

P (c(r,) E BiciTi),riTi) + riT,)y,ii,j) E G^.^it) c(r,),z E ^-(f); riTi),i E [m^ 




n n ^KdiriTi) +riTj))^ 






^{TinT,7^0;{z,i}ez,z,ieS5-(z)} H (’’(Fz)+?’(?)))“ 

{i,j}&:i,j&g+(t) 


Applying this procedure inductively back until the 1-st generation , that is the root zq of the tree, 
yields 


p(t) ^ (zv^n ^)^k>i#Sk 


n (r(r,) + r(r,))" 

{i,j)eUkgt+\t) 
Now, this union contains all the $q - 1$ edges of $t$, whence

p{t) ^ H {r{Ti) + r{T,))‘^ ^ {dKdn-y-^Er{Ti)^’i-^'>‘^, 

by using the Cauchy-Schwarz inequality, whence (6.17) follows.
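The elementary step behind this tree bound is that, conditionally on the radii, two grains with independent uniform germs in a cube of volume $n$ intersect with probability at most $\kappa_d (r_1 + r_2)^d / n$. The Monte Carlo sketch below checks this for discs in the plane with deterministic radii; the numerical values, the seed and the function name are assumptions chosen for the illustration.

```python
import math
import random


def intersection_probability(n, r1, r2, trials=20000, seed=0):
    """Monte Carlo estimate of P(two discs intersect) for independent germs
    uniform in a square of area n and deterministic radii r1, r2."""
    rng = random.Random(seed)
    side = math.sqrt(n)
    hits = 0
    for _ in range(trials):
        c1 = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        c2 = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        if math.dist(c1, c2) <= r1 + r2:
            hits += 1
    return hits / trials


n, r1, r2 = 100, 0.5, 0.5
estimate = intersection_probability(n, r1, r2)
bound = math.pi * (r1 + r2) ** 2 / n       # kappa_2 (r1 + r2)^2 / n
print(f"estimate {estimate:.4f} <= bound {bound:.4f}")
```

The estimate falls slightly below the bound because germs near the boundary of the square have part of the ball $B(c_1, r_1 + r_2)$ outside of it. Multiplying one such factor per edge of a tree is what produces the power $n^{-(q-1)}$.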

We have 


lc={ki,...,ke)G[m]^ 

for some C > 0, by using Er(Xi)®'^ < oo, which treats all the terms of (|5.ip except the ones 
containing Bn{f) and 

We call, for ui,..., Ug distinct integers, Z > 0,p > 4, 

= {k= {h,...,kp) G [ml'P : #{ui,... ,Uq,ki,... ,kp}} = q + l. 

We can easily prove that there are constants Ci not depending on m such that 

^ ' ( 6 - 18 ) 
We have, for T with 2n iid components, using ()6.15p . 

n 

Bn{f)^Y P{{'^^k},{2,k},{l,ki};i = 1,... ,A) 

k=l k=(fci)s[2n]'* 

5 

<E E = 2, ... ,5). 

1=0 keH5_2,^ 

For k G [iT^]i 2 -i^ easily extract a tree with I + 1 edges from {{1, fci}, {2, fei}, {1, A:*}; i = 

2,..., 5}, whence p6.17p yields 

5 

Bn{f)^CY E 

1=0 ke[m]f 2 ., 

using also (16.1811 . This gives sup„ ni?n(/) < oo. Similar computations yield 

b:.(/)<e^ 

l{TinTfe7^0,r2nTfe7^0} E ^{TnTj,,^0,T3nr^,^0} E 

k k' \c=(ki,k2^k^,k4)£[m]'^ 

^ E A:2},{1, = 3,... ,6) 

k=(fci)e[m]® 

6 

= E E 7'({2,A:i},{3,A:2},{l,A:i},z = 1,...,6) 

1=0 k=(fci)eHf 2,3;i 
and for $k \in [m]^6_{1,2,3;l}$ one can extract a tree with $l + 2$ edges from $\{\{2, k_1\}, \{3, k_2\}, \{1, k_i\};\ i = 1, \dots, 6\}$,
whence

6 

Kif) ^ E E i'^ddn-y+^ ^ Cn-\ 

1=0 ke[m]f 2_3.( 

which concludes the proof. 

□ 

6.3 Further applications 

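It is proved in [1] that, in the notation of Theorem 4.2 and for $\sigma = 1$,

\[ d_W(W, N) \le \delta_1 + \delta_2 \qquad (6.19) \]
\[ \delta_1 := \sqrt{\mathrm{Var}(E(T|X))} \qquad (6.20) \]
\[ \delta_2 := \frac{1}{2} \sum_{i=1}^n E|\Delta_i f(X)|^3 \qquad (6.21) \]
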

where dw is the 1-Wasserstein distance. This bound has been successfully applied in and [ 21 ] 

to several normal approximation problems. Without fully developing the details, we indicate here
how we can obtain similar bounds in the Kolmogorov distance by using the techniques developed
in this paper. Assuming that $\sigma = 1$, the new terms in (4.2) with respect to (6.19) are

= VVar(E(r'|A)) 

5^=6^y/E|Z),/(A)|6. 

i=i 

The term <5^ is very close in its expression to di. In the examples developed below, it is indeed 
possible to apply the bound already derived for (5i to <5'^. The term (5^ has to be dealt with 
separately, it is in general more straightforward. Remark that can be replaced by the bound 
62 = SUP 24 E|/(X)Z)j/(A"^)^| from ()4.ip . which can give a better convergence rate or less 
restrictive hypotheses, but it requires a specific analysis and we do not develop it below. 

Nearest neighbours statistics. Let A: > I, z > 1, let V' : —)• R be a measurable function and 

let 


n 

where the arguments of $\psi$ other than $x_i$ are the $k$ nearest neighbours of $x_i$ among $(x_1, \dots, x_n)$ for the Euclidean distance,
ordered by increasing distance to $x_i$, with an arbitrary tie-breaking rule. Given $n$ i.i.d. random
variables $X_1, \dots, X_n$ in $\mathbb{R}^d$, in [4] Chatterjee obtains estimates on the Wasserstein distance between
$f(X)$ and the normal law under the assumption that for $i \ne j$, $\|X_i - X_j\|$ is a continuous random
variable. He obtains the bounds, for $p > 8$,


^ Cd 
^2 ^ Cd 


kH 


8)/2p 


fc^7p 


(J^Jlip ^)/^P 


where jp := (E|V’(^i, • • •, > 0. These bounds are obtained through [H Theorem 2.5], 

which is similar to Theorem 15.11 where our bound on 5'i is already smaller or equal to the bound 
on (5i from [U Theorem 2.5], up to a constant, see Remark 15.41 Therefore we have 5'i ^ C5i. In 
order to obtain an explicit bound on the Kolmogorov distance, it therefore only remains to bound 
52 - In [4] it is shown that Esup”^;^ \/S.jf {X)\^ ^ {m? + from where the bounds 


2/p 


S'l ^ Ck,dn^^'^ Esr”plAj/(X)l^’ ^ Ck,dn^^Pn^^‘^n = Ck,d—r— 


7p 


i=i 


^(p-8)/2p 


3/p 


Esup1A,/(X)1M ^ 


7p^ 


i=i 


^1/2-6/p 


3/p 


(Esup1A,/(A)1H ^52. 


easily follow. We observe that in [4] a more general situation is actually considered: for each $i$, a
different functional $\psi_i$ is applied in the definition of $f$. However, all the explicit

examples developed in that reference are purely geometric, in the sense that this subtlety is not
exploited, and the functional $f(X)$ is symmetric. These examples include the average distance
to the nearest neighbour, the degree count in the nearest-neighbour graph, and the Levina-Bickel
statistic with parameter $k$, which is defined by



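As a concrete instance of such a nearest-neighbour statistic, the sketch below evaluates a sum of $\psi(x_i, \text{neighbours of } x_i)$ for the simplest example listed above, the average distance to the nearest neighbour. The normalisation by $n$, the tie-breaking by index and the function names are assumptions made for this illustration.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k nearest neighbours of points[i], closest first;
    ties are broken by index (an arbitrary rule)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return others[:k]


def nn_statistic(points, k, psi):
    """Sum over i of psi(x_i, first neighbour, ..., k-th neighbour);
    any normalisation is left to the caller."""
    total = 0.0
    for i, x in enumerate(points):
        neighbours = [points[j] for j in k_nearest(points, i, k)]
        total += psi(x, *neighbours)
    return total


def nearest_distance(x, y):
    """psi for k = 1: distance from x to its nearest neighbour y."""
    return math.dist(x, y)


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
average = nn_statistic(points, 1, nearest_distance) / len(points)
print(average)
```

Because the same `psi` is applied at every point, the resulting functional is symmetric in its arguments, which is the situation described in the text.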
Flux through a random conductor. In [21], Nolen considers the solution of an elliptic partial differential
equation with a stationary random conductivity coefficient $a(x)$ over the torus $[0, L)^d$, $L > 0$.
The random function $a(x)$ depends on the local contributions of a set of i.i.d. variables $Z =
(Z_1, \dots, Z_k)$ indexed by $\mathbb{Z}^d \cap [0, L)^d$. He derives a bound on the Wasserstein distance between
the normal law and the average flux $T(Z)$ of the solution. He obtains the bounds

(5i ^ log(L) (e^o^) , (6 .22) 

52 ^ Cz7-3l-2'^E4>[;, (6.23) 

where $\sigma^2$ is the variance and $\Phi_0$ is an integral related to the gradient of the solution over $[0,1)^d$
(see [21] for details).

Our method allows one to extend this result to the Kolmogorov distance, under slightly stronger
assumptions. Gloria and Nolen m have also used Theorem 4.2 for a Kolmogorov Berry-Esseen
bound with a discretised version of the problem. Once again, the simple inequality $||a| - |b|| \le
|a - b|$, $a, b \in \mathbb{R}$, yields that the upper bound on $\mathrm{Var}(T(Z, Z')|Z')$ derived in [21, (2.25)-(2.27)] and
then used in (4.53) can be used in an exactly similar fashion to bound $\mathrm{Var}(T'(Z, Z')|Z')$ where $T'$ is
defined as in our Theorem 4.2. This yields that $\delta'_1$ satisfies the same bound as $\delta_1$, up to a constant.
Then, [21, Lemma 4.1] provides the estimate
E|Ajr(Z)|'? ^ 

which readily yields the first term of 16.221 and the bound on the Kolmogorov distance 

+ <52 + <5'i + 5^ ^ C(<5i + L-^VeI^oP^). 

Note that the new condition E|4>oP^ < oo might be weakened if one uses (14.ip instead of (14.21) . as 
it is done in the proof of Theorem 6.1.